
Posted to user@geode.apache.org by Amit Pandey <am...@gmail.com> on 2017/01/14 15:01:19 UTC

Re: Load all data from DB on Cache Start

Hey John,

How do we hook up post processors for a region?

If I have a region like this:

<gfe:partitioned-region id="trades">
    <gfe:cache-loader>
        <bean class="x.y.z.TradeLoader"/>
    </gfe:cache-loader>
    <gfe:cache-writer>
        <bean class="x.y.z.TradeWriter"/>
    </gfe:cache-writer>
</gfe:partitioned-region>


How do we hook up the post processor?


On Tue, Dec 27, 2016 at 1:22 PM, Amit Pandey <am...@gmail.com>
wrote:

> Hey,
>
> Happy Holidays. Wishing you a great new year :)
>
> Regards
>
> On Tue, Dec 27, 2016 at 1:08 PM, John Blum <jb...@pivotal.io> wrote:
>
>> ;-)  Happy holidays my friend.  Hope you are getting some good R&R.
>>
>> On Mon, Dec 26, 2016 at 2:14 PM, Udo Kohlmeyer <uk...@pivotal.io>
>> wrote:
>>
>>> it helps a lot! :D
>>>
>>> On 12/26/16 12:28, John Blum wrote:
>>>
>>> Amit-
>>>
>>> Regarding...
>>>
>>> *> I want to load all data on cache startup at a go.*
>>>
>>> Since you are using "*Spring*", you could easily implement a *Spring*
>>> BeanPostProcessor [1] (BPP) for each (or all the) *Region(s)* in which
>>> you need to load data.  I do this frequently in *Spring Data
>>> GemFire/Geode's* test suite when testing *Region* data access
>>> operations using the GemfireTemplate, *Repositories* or things of that
>>> nature.  Clearly your BPP could use a DataSource to load the data from
>>> an external data store (e.g. RDBMS).
>>>
>>> Another way to load data on startup is to use a Geode *Initializer*.
>>> However, this would require you to specify a snippet of cache.xml and
>>> does not work if you specify your *Regions* in *Spring* (XML/Java)
>>> config, as you should when using *Spring*.  I also don't recommend using
>>> cache.xml, but it is the pure, non-*Spring* way to invoke logic after the
>>> cache has been "fully" initialized (i.e. where the *Regions* have been
>>> defined in cache.xml).
>>>
>>> See here [2] for more details.  Note, the documentation talks of
>>> "launching an application" on startup, after cache initialization, but
>>> technically, you can do whatever you want, like load data.
>>>
>>> I recommend the BPP.
>>>
>>>
>>> *> How should I set it up in config to allow it to join other nodes in
>>> cluster?*
>>>
>>> Regardless of whether your server data node is "embedded" or not, you
>>> can still use a Locator, or mcast to have the node join the cluster.  The
>>> "embedded" scenario, where the "application" is a GemFire Server data node
>>> will be part of the cluster as Udo said.
>>>
>>> This is easily achievable with...
>>>
>>> <util:properties id="gemfireProperties">
>>>   <prop key="name">Example</prop>
>>>   <!-- Set to non-zero value to use Multicast; comment out "locators" -->
>>>   <prop key="mcast-port">0</prop>
>>>   <prop key="log-level">${gemfire.log-level:config}</prop>
>>>   <prop key="locators">someHost[10334]</prop>
>>>   <prop key="start-locator">localhost[1034]</prop>
>>> </util:properties>
>>>
>>> <gfe:cache properties-ref="gemfireProperties"/>
>>>
>>> ...
>>>
>>>
>>> As you can see from the snippet of *Spring* XML config above, this
>>> application is a Geode "peer" cache (i.e. embeds a Geode data node/server).
>>>
>>> The "*locators*" Geode/GemFire property enables this node to connect to
>>> a cluster.  You can use the "*mcast-port*" property instead; however, I
>>> would recommend *Locators* over mcast.
>>>
>>> Additionally, you can see that I specified the "start-locator"
>>> Geode/GemFire property, which enables me to start an embedded Locator.
>>> This is useful for testing purposes and for connecting Geode data nodes
>>> together in a cluster without a dedicated Locator, though this approach is
>>> less resilient if the applications/servers go down (as may be the case in
>>> a micro-services scenario)!
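
For anyone configuring this from Java rather than XML, the same property keys can be collected in a plain java.util.Properties object. This is only a minimal sketch: the class name PeerCacheProperties and the method locatorBased are hypothetical, and the property keys and example values are taken from the XML snippet above.

```java
import java.util.Properties;

// Hypothetical helper class; it is not part of Geode or Spring Data Geode.
final class PeerCacheProperties {

    // Collects the same keys as the <util:properties> block in the message.
    static Properties locatorBased(String name, String locators) {
        Properties props = new Properties();
        props.setProperty("name", name);
        // 0 leaves multicast discovery off, so the node relies on locators.
        props.setProperty("mcast-port", "0");
        // Mirrors the ${gemfire.log-level:config} placeholder and its default.
        props.setProperty("log-level", System.getProperty("gemfire.log-level", "config"));
        props.setProperty("locators", locators); // e.g. "someHost[10334]"
        return props;
    }

    public static void main(String[] args) {
        locatorBased("Example", "someHost[10334]").list(System.out);
    }
}
```

A Properties object like this would typically be handed to whatever creates the peer cache; that step is left out here because it depends on the Geode and Spring versions in use.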
>>>
>>>
>>> *> if I start with embedded server is it required to use client pool or
>>> is it not required?*
>>>
>>> A "client pool" is only applicable to cache clients (i.e. ClientCaches)
>>> on the "client-side" of the equation.  "peers" find (Locator, mcast) and
>>> communicate (TCP/UDP, JGroups) with each other through other means once a
>>> cluster is formed.
>>>
>>> In fact, typically, it is more common to position your
>>> microservices-based applications as Geode cache clients (i.e. <gfe:client-cache
>>> ...>) and have them connect to a dedicated Geode service (i.e. cluster
>>> of Geode servers/data nodes where also, 1 or more of those nodes are
>>> running a "CacheServer", listening for cache clients to connect).
>>> These dedicated Geode server nodes in a cluster constituting the service
>>> can still be configured with *Spring*, but they typically will not
>>> contain any application-specific components other than CacheListeners,
>>> Loaders, Writers, AEQ *Listeners*, etc.
>>>
>>> ClientCache applications use 1 or more Pools configured to talk to the
>>> servers in the cluster (either by way of Locator or direct server
>>> communication). Pools can be configured with groups to target specific
>>> members (in that group) in the cluster.  Typically, members in 1 group host
>>> a different set of Regions from members in another group, which is a way to
>>> separate data traffic so that each client talks only to the members
>>> dedicated to a specific resource/purpose (usually based on business
>>> function, etc).
>>>
>>> On a side note, some of what you are wanting to do "scale-wise" seems
>>> like a perfect fit for Pivotal CloudFoundry, which can auto-scale up or
>>> down nodes in your cluster based on load and other factors.
>>>
>>> Anyway, hope this helps!
>>>
>>> -John
>>>
>>>
>>>
>>>
>>>
>>> [1] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension-bpp
>>> [2] http://geode.apache.org/docs/guide/basic_config/the_cache/setting_cache_initializer.html
>>>
>>>
>>> On Sun, Dec 25, 2016 at 11:12 PM, Amit Pandey <amit.pandey2103@gmail.com
>>> > wrote:
>>>
>>>> Hey,
>>>>
>>>> Thanks.
>>>>
>>>> I have lots of reference data which will be loaded at start of day.
>>>> This data is not expected to change much, so I want to load it once at
>>>> the start of day. Read-through would make access slow the first time each
>>>> entry is actually read, so I want to keep it all loaded in memory.
>>>>
>>>> Also I want to have functions which will be called by clients to do
>>>> some compute and return results. Using functions should allow me to add
>>>> nodes and speed up the compute.
>>>>
>>>> I have some micro services each of which will start a gemfire node, and
>>>> I want to connect, so yes I can set it up with locator.
>>>>
>>>> However I have one doubt, if I start with embedded server is it
>>>> required to use client pool or is it not required?
>>>>
>>>> Regards
>>>>
>>>> On Mon, Dec 26, 2016 at 1:18 AM, Udo Kohlmeyer <uk...@pivotal.io>
>>>> wrote:
>>>>
>>>>> Hi there Amit,
>>>>>
>>>>> At this stage the only way you could load all data in one go is to
>>>>> write a client to connect to the db and load it all in. Another approach
>>>>> could be to write the same code into a function and invoke the function at
>>>>> start up. But both approaches are manual.
>>>>>
>>>>> To have geode servers join a cluster, you have 2 ways.
>>>>>
>>>>>    1. Connecting them up via a locator
>>>>>    2. Connecting them up via mcast.
>>>>>
>>>>> Please be aware that once you connect a server to a cluster, that
>>>>> server becomes an integral part of the cluster, so adding/removing servers
>>>>> from a cluster is not something you'd want to do in a load-based scaling
>>>>> model, i.e. if the load is high, add a server and if load is low, shut down
>>>>> a server.
>>>>>
>>>>> Just for interest's sake, what is your use case?
>>>>>
>>>>> --Udo
>>>>>
>>>>> On 12/24/16 05:57, Amit Pandey wrote:
>>>>>
>>>>> Hi Guys,
>>>>>
>>>>> I am using Spring Data Geode. I have been able to use read and write
>>>>> through/ write behind. I want to load all data on cache startup at a go.
>>>>>
>>>>> Secondly my geode server is embedded but I want to allow it to join
>>>>> other nodes.  How should I set it up in config to allow it to join other
>>>>> nodes in cluster?
>>>>>
>>>>> Regards
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> -John
>>> john.blum10101 (skype)
>>>
>>>
>>>
>>
>>
>> --
>> -John
>> john.blum10101 (skype)
>>
>
>

Re: Load all data from DB on Cache Start

Posted by Amit Pandey <am...@gmail.com>.

John,

Thanks, I got it working. Particularly thanks for the tip on not autowiring
anything, as I would have made that mistake :)

Regards

On Sun, Jan 15, 2017 at 3:39 AM, Luke Shannon <ls...@pivotal.io> wrote:

> Great points John. Lots of gems in those Geode tips you just gave :-D
>
> On Jan 14, 2017 4:39 PM, "John Blum" <jb...@pivotal.io> wrote:
>
>> Amit-
>>
>> Another thing, a BPP is my recommended way in *Spring* to load data into
>> a Region after initialization, so I wholeheartedly support Luke on this.
>>
>> Also keep in mind that a BPP callback method is invoked synchronously
>> during a *Spring* ApplicationContext refresh and will block all other
>> beans (coming after) from being initialized.  If you need the initial
>> Region load to be done asynchronously, then you are responsible for
>> making that happen... perhaps with an appropriate Executor and Future.
>> You can also publish (fire) an ApplicationEvent to your "interested"
>> application components (beans) that need to know when the Region is fully
>> loaded and ready for use.
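
The Executor-and-Future idea can be sketched with the JDK alone. In this rough illustration a plain java.util.Map stands in for the Region, and AsyncPreloader, preload and the sample data are all hypothetical names, not anything from Spring or Geode. The returned future is the point where interested components could be notified that the data is ready.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical helper; a plain Map stands in for the Region.
final class AsyncPreloader {

    // Runs the (possibly slow) source on the executor, copies the result into
    // the target map, and completes with the number of entries that were read.
    static <K, V> CompletableFuture<Integer> preload(
            Map<K, V> region, Supplier<Map<K, V>> source, ExecutorService executor) {
        return CompletableFuture.supplyAsync(source, executor)
                .thenApply(data -> {
                    region.putAll(data);
                    return data.size();
                });
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Map<String, String> region = new ConcurrentHashMap<>();
        try {
            // join() blocks only this demo; a real caller would attach a callback.
            int loaded = preload(region, () -> Map.of("p1", "Widget", "p2", "Gadget"), executor).join();
            System.out.println("Loaded " + loaded + " entries");
        } finally {
            executor.shutdown();
        }
    }
}
```

Inside a BPP the callback would return immediately after submitting the work, so the ApplicationContext refresh is not held up by the load.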
>>
>> Additionally, if you do not need to preload your Region on startup, then
>> a CacheLoader is the recommended way to load data into your Region on
>> cache misses (another synchronous mechanism called a "read-through").
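
To make the read-through idea concrete, here is a toy cache written against the JDK only. It is not Geode's CacheLoader API; ReadThroughCache, its loader function and loadCount are hypothetical names used purely to show that the loader runs synchronously on a miss and is skipped on a hit.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Toy read-through cache; illustrative only, not the Geode CacheLoader API.
final class ReadThroughCache<K, V> {

    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // On a miss the loader is called synchronously and its result is stored;
    // on a hit the stored value is returned without touching the loader.
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Number of times the loader was actually invoked.
    int loadCount() {
        return loads.get();
    }
}
```

Calling get twice with the same key invokes the loader once, which is why the first access to each entry is the slow one.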
>>
>> A word of caution: never, ever auto-wire or inject any beans into a BPP.
>> To do so could cause premature initialization.  Always rely on the bean
>> instance passed to the BPP's postProcessXXXX methods.
>>
>> Thanks,
>> John
>>
>>
>> On Sat, Jan 14, 2017 at 1:30 PM, John Blum <jb...@pivotal.io> wrote:
>>
>>> Hi Amit, Luke-
>>>
>>> Thank you Luke.
>>>
>>> Actually Luke is mostly correct.  In this case, the order, however, DOES
>>> NOT matter.  The *Spring* container is intimately aware of certain
>>> types of beans defined/declared in the *Spring* ApplicationContext.
>>> BeanPostProcessors, a container extension point (hook), are one of them.
>>>
>>> *Spring* creates all BeanPostProcessors (BPP) before any other
>>> application beans in order to post process each bean defined/declared in
>>> the container (except for BPPs and BeanFactoryPostProcessors, of
>>> course).  The container then proceeds to call the BPP *before* the bean
>>> is "initialized" by the container (i.e.
>>> postProcessBeforeInitialization(..)) as well as *after* the bean has been
>>> "initialized" (i.e. postProcessAfterInitialization(..)).  A bean
>>> initialization corresponds to InitializingBean.afterPropertiesSet(),
>>> any init() methods marked as such in XML config or any @PostConstruct
>>> methods.
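
The callback order described here can be pictured with a toy model. The sketch below is not Spring code: PostProcessor and Initializing are hypothetical stand-ins for BeanPostProcessor and InitializingBean, and initialize only imitates the before / init / after sequence for a single bean.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the initialization callbacks; none of these types are Spring's.
final class LifecycleDemo {

    interface PostProcessor {
        Object beforeInit(Object bean, String name);
        Object afterInit(Object bean, String name);
    }

    interface Initializing {
        void afterPropertiesSet();
    }

    // Applies every processor before initialization, runs the init callback,
    // then applies every processor again after initialization.
    static Object initialize(Object bean, String name, List<PostProcessor> processors) {
        Object current = bean;
        for (PostProcessor p : processors) {
            current = p.beforeInit(current, name);
        }
        if (current instanceof Initializing) {
            ((Initializing) current).afterPropertiesSet();
        }
        for (PostProcessor p : processors) {
            current = p.afterInit(current, name);
        }
        return current;
    }

    // Records the order of callbacks for one bean named "Product".
    static List<String> trace() {
        List<String> events = new ArrayList<>();
        PostProcessor recorder = new PostProcessor() {
            public Object beforeInit(Object bean, String name) {
                events.add("before:" + name);
                return bean;
            }
            public Object afterInit(Object bean, String name) {
                events.add("after:" + name);
                return bean;
            }
        };
        Initializing region = () -> events.add("init");
        initialize(region, "Product", List.of(recorder));
        return events;
    }
}
```

In this model the "after" callback is the first point at which the initialized object is available, which matches why the data load belongs in postProcessAfterInitialization.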
>>>
>>> Most SDG FactoryBeans (e.g. PartitionedRegionFactoryBean ->
>>> RegionFactoryBean) always create their GemFire object (e.g. Region) in
>>> the afterPropertiesSet() (i.e. initialization) method as the
>>> <SDG>FactoryBean implements *Spring's* InitializingBean (callback)
>>> interface.
>>>
>>> Therefore, technically, it is safe to define/declare any beans, in any
>>> order, since the dependencies and callbacks (BPP) pretty much determine the
>>> order in which beans are constructed, configured and initialized.  SDG even
>>> takes the Spring container DI concept to the level of ensuring GemFire
>>> objects are created in the order that GemFire expects based on both
>>> explicit and implicit dependencies (think Regions and a DiskStore, for
>>> instance, where the DS is just named in the Region configuration;
>>> under-the-hood, though, SDG creates a RuntimeBeanReference on the named DS
>>> to ensure the proper order).  As another example, it is also possible
>>> to define/declare your Regions before the Cache instance...
>>>
>>> <gfe:partitioned-region id="Products" ... />
>>>
>>> <gfe:cache/>
>>>
>>> SDG does not care how you define your beans and generally will do the
>>> right thing.  Using JavaConfig is a bit different though, and in certain
>>> cases you have to be a bit more conscientious of the order.
>>>
>>> In general, if you had a container with multiple beans defined/declared
>>> that had NO dependencies between them (or other pre-defined order
>>> specified, such as when using *Spring's* @Order annotation in an
>>> AnnotationConfigApplicationContext or by implementing the Ordered
>>> interface), then *Spring* will pretty much proceed to construct,
>>> configure and initialize beans in the order they are declared in the
>>> ApplicationContext config.
>>>
>>> Now, if you have multiple BPPs to process the Region, for various
>>> reasons, then you will need to define order among them by having your
>>> custom BPPs implement the Ordered (or PriorityOrdered) interface, if order
>>> is important; note that the @Order annotation is not taken into account
>>> for BeanPostProcessor beans.  If an order is not given, then
>>> *Spring* makes no guarantees which BPP will be invoked first.
>>>
>>> Anyway, all of this is well-described in the Spring documentation on "*Customizing
>>> the nature of a bean*" [1] as well as in "Container Extension Points"
>>> [2].
>>>
>>> Hope this helps.
>>>
>>> -John
>>>
>>> [1] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-nature
>>> [2] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension
>>>
>>>
>>> On Sat, Jan 14, 2017 at 8:38 AM, Amit Pandey <am...@gmail.com>
>>> wrote:
>>>
>>>> Okay... yes, as post processors process everything in the IoC container,
>>>> that's the only way I guess.
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>> On Sat, Jan 14, 2017 at 9:36 PM, Luke Shannon <ls...@pivotal.io>
>>>> wrote:
>>>>
>>>>> Hi Amit,
>>>>>
>>>>> In the past I have done it like this:
>>>>>
>>>>> Define a BeanPostProcessor like below. It will go out and get the data
>>>>> from wherever it lives, convert it to objects and then put them into the
>>>>> region using a Region reference passed in shortly after the region is
>>>>> initialized. This bean will need to be on the classpath of Geode when it
>>>>> starts up. If using gfsh you can add it to the '--classpath' argument of the
>>>>> 'start server' command.
>>>>>
>>>>> You can then wire this bean into the Geode Cache xml like so:
>>>>>
>>>>> <gfe:replicated-region id="Product" />
>>>>>
>>>>> <bean id="productLoader" class="mypackage.ProductLoader">
>>>>>   <property name="targetBeanName" value="Product" />
>>>>> </bean>
>>>>>
>>>>> Note that this bean is placed *below* your region definitions in the
>>>>> spring cache xml. If I remember correctly order matters and it will try and
>>>>> run this before the Region reference is created if the order is not correct.
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Luke
>>>>>
>>>>> import java.io.BufferedReader;
>>>>> import java.io.File;
>>>>> import java.io.FileReader;
>>>>> import java.io.IOException;
>>>>> import java.util.HashMap;
>>>>> import java.util.Map;
>>>>> import java.util.logging.Logger;
>>>>>
>>>>> import org.springframework.beans.BeansException;
>>>>> import org.springframework.beans.factory.config.BeanPostProcessor;
>>>>> import org.springframework.util.Assert;
>>>>> import org.springframework.util.StringUtils;
>>>>>
>>>>> import com.gemstone.gemfire.cache.Region;
>>>>> import com.google.gson.Gson;
>>>>>
>>>>> public class ProductLoader implements BeanPostProcessor {
>>>>>
>>>>>   // The original snippet used an undeclared "log"; java.util.logging is assumed here.
>>>>>   private static final Logger log = Logger.getLogger(ProductLoader.class.getName());
>>>>>
>>>>>   private String targetBeanName;
>>>>>
>>>>>   protected String getTargetBeanName() {
>>>>>     Assert.state(StringUtils.hasText(targetBeanName),
>>>>>         "The target Spring context bean name was not properly specified!");
>>>>>     return targetBeanName;
>>>>>   }
>>>>>
>>>>>   public void setTargetBeanName(final String targetBeanName) {
>>>>>     Assert.hasText(targetBeanName, "The target Spring context bean name must be specified!");
>>>>>     this.targetBeanName = targetBeanName;
>>>>>   }
>>>>>
>>>>>   @Override
>>>>>   public Object postProcessBeforeInitialization(final Object bean, final String beanName)
>>>>>       throws BeansException {
>>>>>     return bean; // nothing to do before the Region is initialized
>>>>>   }
>>>>>
>>>>>   @SuppressWarnings({ "unchecked", "rawtypes" })
>>>>>   @Override
>>>>>   public Object postProcessAfterInitialization(final Object bean, final String beanName)
>>>>>       throws BeansException {
>>>>>     if (beanName.equals(getTargetBeanName()) && bean instanceof Region) {
>>>>>       // Get your data from where it lives and do a put or a putAll into the region here, e.g.:
>>>>>       // ((Region) bean).put(<Key For Product>, <Product Value>);
>>>>>       log.info("Preloading complete. Region now has: " + ((Region) bean).size());
>>>>>     }
>>>>>     return bean;
>>>>>   }
>>>>> }
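
For the "put or a put all" step, one possible shape is to copy rows into the target in fixed-size chunks, so that a large data set is not pushed one entry at a time. This is a sketch under assumptions: BatchLoader, loadInBatches, keyOf and batchSize are hypothetical names, and a plain java.util.Map stands in for the Region (Geode's Region is itself a Map, so the same call shape should carry over).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper; any java.util.Map can serve as the target.
final class BatchLoader {

    // Copies rows into the target map with putAll, batchSize entries at a
    // time, and returns how many putAll calls were made.
    static <K, V> int loadInBatches(
            Map<K, V> region, List<V> rows, Function<V, K> keyOf, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        Map<K, V> batch = new HashMap<>();
        for (V row : rows) {
            batch.put(keyOf.apply(row), row);
            if (batch.size() == batchSize) {
                region.putAll(batch);
                batch.clear();
                batches++;
            }
        }
        if (!batch.isEmpty()) { // flush the final, partially filled batch
            region.putAll(batch);
            batches++;
        }
        return batches;
    }
}
```

The batch size is a tuning knob: larger batches mean fewer calls but more memory held per call.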
>>>>>
>>>>>
>>>>> --
>>>>> Luke Shannon | Platform Engineering | Pivotal
>>>>> -------------------------------------------------------------------------
>>>>>
>>>>> Mobile: 416-571-9495
>>>>> Join the Toronto Pivotal Usergroup: http://www.meetup.com/Toronto-Pivotal-User-Group/
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> -John
>>> john.blum10101 (skype)
>>>
>>
>>
>>
>> --
>> -John
>> john.blum10101 (skype)
>>
>

Re: Load all data from DB on Cache Start

Posted by Luke Shannon <ls...@pivotal.io>.

Great points John. Lots of gems in those Geode tips you just gave :-D

>>> On Sat, Jan 14, 2017 at 9:36 PM, Luke Shannon <ls...@pivotal.io>
>>> wrote:
>>>
>>>> Hi Amit,
>>>>
>>>> In the past I have done it like this:
>>>>
>>>> Define a BeanPostProcessor like below. It will go out and get the data
>>>> from wherever it lives, convert it to objects and then put them into the
>>>> region using a Region reference passed in shortly after the region is
>>>> initialized. This bean will need to be on the class path of Geode when it
>>>> starts up. If using gfsh you can add it to the '--classpath' argument of the
>>>> 'start server' command.
>>>>
>>>> You can then wire this bean into the Geode Cache xml like so:
>>>>
>>>> <gfe:replicated-region id="Product" />
>>>>
>>>> <bean id="productLoader" class="mypackage.ProductLoader">
>>>>
>>>> <property name="targetBeanName" value="Product" />
>>>>
>>>> </bean>
>>>>
>>>> Note that this bean is placed *below* your region definitions in the
>>>> spring cache xml. If I remember correctly order matters and it will try and
>>>> run this before the Region reference is created if the order is not correct.
>>>>
>>>> Hope this helps,
>>>>
>>>> Luke
>>>>
>>>> import java.util.Collections;
>>>> import java.util.Map;
>>>> import java.util.logging.Logger;
>>>> import org.springframework.beans.BeansException;
>>>> import org.springframework.beans.factory.config.BeanPostProcessor;
>>>> import org.springframework.util.Assert;
>>>> import org.springframework.util.StringUtils;
>>>> import com.gemstone.gemfire.cache.Region;
>>>>
>>>> public class ProductLoader implements BeanPostProcessor {
>>>>
>>>>   // The original snippet did not show how "log" was declared;
>>>>   // java.util.logging is assumed here.
>>>>   private static final Logger log = Logger.getLogger(ProductLoader.class.getName());
>>>>
>>>>   private String targetBeanName;
>>>>
>>>>   protected String getTargetBeanName() {
>>>>     Assert.state(StringUtils.hasText(targetBeanName),
>>>>         "The target Spring context bean name was not properly specified!");
>>>>     return targetBeanName;
>>>>   }
>>>>
>>>>   public void setTargetBeanName(final String targetBeanName) {
>>>>     Assert.hasText(targetBeanName, "The target Spring context bean name must be specified!");
>>>>     this.targetBeanName = targetBeanName;
>>>>   }
>>>>
>>>>   // Placeholder (not in the original): get your data from where it lives
>>>>   // and return it as key/value pairs.
>>>>   protected Map<Object, Object> loadProducts() {
>>>>     return Collections.emptyMap();
>>>>   }
>>>>
>>>>   @Override
>>>>   public Object postProcessBeforeInitialization(final Object bean, final String beanName)
>>>>       throws BeansException {
>>>>     return bean;
>>>>   }
>>>>
>>>>   @SuppressWarnings({ "unchecked", "rawtypes" })
>>>>   @Override
>>>>   public Object postProcessAfterInitialization(final Object bean, final String beanName)
>>>>       throws BeansException {
>>>>     if (beanName.equals(getTargetBeanName()) && bean instanceof Region) {
>>>>       // Do a put or a putAll into the region here.
>>>>       ((Region) bean).putAll(loadProducts());
>>>>       log.info("Preloading complete. Region now has: " + ((Region) bean).size());
>>>>     }
>>>>     return bean;
>>>>   }
>>>> }
>>>>
>>>>
>>>> On Sat, Jan 14, 2017 at 10:01 AM, Amit Pandey <
>>>> amit.pandey2103@gmail.com> wrote:
>>>>
>>>>> Hey John,
>>>>>
>>>>> How do we hook up post processors for a region ?
>>>>>
>>>>> If I have a region like :-
>>>>>
>>>>> <gfe:partitioned-region id="trades">
>>>>>     <gfe:cache-loader>
>>>>>         <bean class="x.y.z.TradeLoader"/>
>>>>>     </gfe:cache-loader>
>>>>>     <gfe:cache-writer>
>>>>>         <bean class="x.y.z.TradeWriter"/>
>>>>>     </gfe:cache-writer>
>>>>>
>>>>>
>>>>> </gfe:partitioned-region>
>>>>>
>>>>>
>>>>> How do we hook up the post processor?
>>>>>
>>>>>
>>>>> On Tue, Dec 27, 2016 at 1:22 PM, Amit Pandey <
>>>>> amit.pandey2103@gmail.com> wrote:
>>>>>
>>>>>> Hey,
>>>>>>
>>>>>> Happy Holidays. Wishing you a great new year :)
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> On Tue, Dec 27, 2016 at 1:08 PM, John Blum <jb...@pivotal.io> wrote:
>>>>>>
>>>>>>> ;-)  Happy holidays my friend.  Hope you are getting some good R&R.
>>>>>>>
>>>>>>> On Mon, Dec 26, 2016 at 2:14 PM, Udo Kohlmeyer <
>>>>>>> ukohlmeyer@pivotal.io> wrote:
>>>>>>>
>>>>>>>> it helps a lot! :D
>>>>>>>>
>>>>>>>> On 12/26/16 12:28, John Blum wrote:
>>>>>>>>
>>>>>>>> Amit-
>>>>>>>>
>>>>>>>> Regarding...
>>>>>>>>
>>>>>>>> *> I want to load all data on cache startup at a go.*
>>>>>>>>
>>>>>>>> Since you are using "*Spring*", you could easily implement a
>>>>>>>> *Spring* BeanPostProcessor [1] (BPP) for each (or all the)
>>>>>>>> *Region(s)* in which you need to load data.  I do this frequently
>>>>>>>> in *Spring Data GemFire/Geode's* test suite when testing *Region*
>>>>>>>> data access operations using the GemfireTemplate, *Repositories*
>>>>>>>> or things of that nature.  Clearly your BPP could use a DataSource
>>>>>>>> to load the data from an external data store (e.g. RDBMS).
>>>>>>>>
>>>>>>>> Another way to load data on startup is to use a Geode
>>>>>>>> *Initializer*.  However, this would require you to specify a
>>>>>>>> snippet of cache.xml and does not work if you specify your
>>>>>>>> *Regions* in *Spring* (XML/Java) config as you should when using
>>>>>>>> *Spring*.  I also don't recommend using cache.xml, but it is the
>>>>>>>> pure, non-*Spring* way to invoke logic after the cache has been
>>>>>>>> "fully" initialized (i.e. where the *Regions* have been defined in
>>>>>>>> cache.xml).
>>>>>>>>
>>>>>>>> See here [2] for more details.  Note, the documentation talks of
>>>>>>>> "launching an application" on startup, after cache initialization, but
>>>>>>>> technically, you can do whatever you want, like load data.
>>>>>>>>
>>>>>>>> I recommend the BPP.
>>>>>>>>
>>>>>>>>
>>>>>>>> *> How should I set it up in config to allow it to join other nodes
>>>>>>>> in cluster?*
>>>>>>>>
>>>>>>>> Regardless of whether your server data node is "embedded" or not,
>>>>>>>> you can still use a Locator, or mcast to have the node join the cluster.
>>>>>>>> The "embedded" scenario, where the "application" is a GemFire Server data
>>>>>>>> node will be part of the cluster as Udo said.
>>>>>>>>
>>>>>>>> This is easily achievable with...
>>>>>>>>
>>>>>>>> <util:properties id="gemfireProperties">
>>>>>>>>   <prop key="name">Example</prop>
>>>>>>>>   <!-- Set to non-zero value to use Multicast; comment out "locators" -->
>>>>>>>>   <prop key="mcast-port">0</prop>
>>>>>>>>   <prop key="log-level">${gemfire.log-level:config}</prop>
>>>>>>>>   <prop key="locators">someHost[10334]</prop>
>>>>>>>>   <prop key="start-locator">localhost[1034]</prop>
>>>>>>>> </util:properties>
>>>>>>>>
>>>>>>>> <gfe:cache properties-ref="gemfireProperties"/>
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>>
>>>>>>>> As you can see from the snippet of *Spring* XML config above, this
>>>>>>>> application is a Geode "peer" cache (i.e. embeds a Geode data node/server).
>>>>>>>>
>>>>>>>> The "*locators*" Geode/GemFire property enables this node to
>>>>>>>> connect to a cluster.  Likewise, you can use the "*mcast-port*"
>>>>>>>> property instead, however, I would recommend *Locators* over mcast.
>>>>>>>>
>>>>>>>> Additionally, you can see that I specified the "start-locator"
>>>>>>>> Geode/GemFire property, which enables me to start an embedded Locator.
>>>>>>>> Useful for testing purposes and connecting Geode data nodes together in a
>>>>>>>> cluster without a dedicated Locator, though, this approach is less
>>>>>>>> resilient if the applications/servers go down (as may be the case in a
>>>>>>>> micro-services scenario)!
>>>>>>>>
>>>>>>>>
>>>>>>>> *> if I start with embedded server is it required to use client
>>>>>>>> pool or is it not required?*
>>>>>>>>
>>>>>>>> A "client pool" is only applicable to cache clients (i.e.
>>>>>>>> ClientCaches) on the "client-side" of the equation.  "peers" find
>>>>>>>> (Locator, mcast) and communicate (TCP/UDP, JGroups) with each other through
>>>>>>>> other means once a cluster is formed.
>>>>>>>>
>>>>>>>> In fact, typically, it is more common to position your
>>>>>>>> microservices-based applications as Geode cache clients (i.e. <gfe:client-cache
>>>>>>>> ...>) and have them connect to a dedicated Geode service (i.e.
>>>>>>>> cluster of Geode servers/data nodes where also, 1 or more of those nodes
>>>>>>>> are running a "CacheServer", listening for cache clients to
>>>>>>>> connect).  These dedicated Geode server nodes in a cluster constituting the
>>>>>>>> service can still be configured with *Spring*, but they typically
>>>>>>>> will not contain any application-specific components other than
>>>>>>>> CacheListeners, Loaders, Writers, AEQ *Listeners*, etc.
>>>>>>>>
>>>>>>>> ClientCache applications use 1 or more Pools configured to talk to
>>>>>>>> the servers in the cluster (either by way of Locator or direct server
>>>>>>>> communication). Pools can be configured with groups to target
>>>>>>>> specific members (in that group) in the cluster.  Typically, members in 1
>>>>>>>> group host a different set of Regions from another group, which is a way to
>>>>>>>> separate data traffic from 1 client to another dedicated to a specific
>>>>>>>> resource/purpose (usually based on business function, etc).
>>>>>>>>
>>>>>>>> On a side note, some of what you are wanting to do "scale-wise"
>>>>>>>> seems like a perfect fit for Pivotal CloudFoundry, which can auto-scale up
>>>>>>>> or down nodes in your cluster based on load and other factors.
>>>>>>>>
>>>>>>>> Anyway, hope this helps!
>>>>>>>>
>>>>>>>> -John
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension-bpp
>>>>>>>> [2] http://geode.apache.org/docs/guide/basic_config/the_cache/setting_cache_initializer.html
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Dec 25, 2016 at 11:12 PM, Amit Pandey <
>>>>>>>> amit.pandey2103@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hey,
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> I have lots of reference data which will be loaded at start of
>>>>>>>>> day. This data is not bound to change much and as such I want to keep it
>>>>>>>>> loaded at the start of day. Read through will make it slow while it is
>>>>>>>>> being actually accessed so I want to keep it loaded in memory.
>>>>>>>>>
>>>>>>>>> Also I want to have functions which will be called by clients to
>>>>>>>>> do some compute and return results. Using functions should allow me to add
>>>>>>>>> nodes and speed up the compute.
>>>>>>>>>
>>>>>>>>> I have some micro services each of which will start a gemfire
>>>>>>>>> node, and I want to connect, so yes I can set it up with locator.
>>>>>>>>>
>>>>>>>>> However I have one doubt, if I start with embedded server is it
>>>>>>>>> required to use client pool or is it not required?
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> On Mon, Dec 26, 2016 at 1:18 AM, Udo Kohlmeyer <
>>>>>>>>> ukohlmeyer@pivotal.io> wrote:
>>>>>>>>>
>>>>>>>>>> Hi there Amit,
>>>>>>>>>>
>>>>>>>>>> At this stage the only way you could load all data at one go is
>>>>>>>>>> to write a client to connect to the db and load all in. Another approach
>>>>>>>>>> could be to write the same code into a function and invoke the function at
>>>>>>>>>> start up. But in both cases both are manual.
>>>>>>>>>>
>>>>>>>>>> To have geode servers join a cluster, you have 2 ways.
>>>>>>>>>>
>>>>>>>>>>    1. Connecting them up via a locator
>>>>>>>>>>    2. Connecting them up via mcast.
>>>>>>>>>>
>>>>>>>>>> Please be aware that once you connect a server to a cluster, that
>>>>>>>>>> server becomes an integral part of the cluster so adding/removing servers
>>>>>>>>>> from a cluster is not something you'd want to do in a load-based scaling
>>>>>>>>>> model. i.e if the load is high, add a server and if load is low, shut down
>>>>>>>>>> a server.
>>>>>>>>>>
>>>>>>>>>> Just for interest's sake, what is your use case?
>>>>>>>>>>
>>>>>>>>>> --Udo
>>>>>>>>>>
>>>>>>>>>> On 12/24/16 05:57, Amit Pandey wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Guys,
>>>>>>>>>>
>>>>>>>>>> I am using Spring Data Geode. I have been able to use read and
>>>>>>>>>> write through/ write behind. I want to load all data on cache startup at a
>>>>>>>>>> go.
>>>>>>>>>>
>>>>>>>>>> Secondly my geode server is embedded but I want to allow it to join
>>>>>>>>>> other nodes.  How should I set it up in config to allow it to join other
>>>>>>>>>> nodes in cluster?
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> -John
>>>>>>>> john.blum10101 (skype)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -John
>>>>>>> john.blum10101 (skype)
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Luke Shannon | Platform Engineering | Pivotal
>>>> -------------------------------------------------------------------------
>>>>
>>>> Mobile:416-571-9495
>>>> Join the Toronto Pivotal Usergroup: http://www.meetup.com/Toronto-Pivotal-User-Group/
>>>>
>>>
>>>
>>
>>
>> --
>> -John
>> john.blum10101 (skype)
>>
>
>
>
> --
> -John
> john.blum10101 (skype)
>

Re: Load all data from DB on Cache Start

Posted by John Blum <jb...@pivotal.io>.

Amit-

Another thing, a BPP is my recommended way in *Spring* to load data into a
Region after initialization, so I wholeheartedly support Luke on this.

Also keep in mind, if you need the initial Region load to be done
asynchronously (a BPP callback method is invoked synchronously during a
*Spring* ApplicationContext refresh and will block all other (possible)
beans (coming after) from being initialized), then you are responsible for
making that happen... perhaps with an appropriate Executor and Future.
Keep in mind that you can also publish (fire) an ApplicationEvent to your
"interested" application components (beans) that need to know when the
Region is fully loaded and ready for use.
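
As a rough illustration of the asynchronous variant, here is a minimal
sketch in plain Java.  It makes several assumptions that are not from this
thread: a ConcurrentHashMap stands in for the Region, a Supplier stands in
for the database query, and a Runnable callback stands in for publishing a
*Spring* ApplicationEvent.  The names AsyncPreloadSketch, preload and
onLoaded are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch: the Map stands in for a Region, the Supplier for a DB query.
class AsyncPreloadSketch {

    // Starts the load on the given executor and returns immediately.
    // onLoaded runs once all entries are in the map (stand-in for firing an ApplicationEvent).
    static <K, V> CompletableFuture<Integer> preload(Map<K, V> region,
                                                     Supplier<Map<K, V>> source,
                                                     ExecutorService executor,
                                                     Runnable onLoaded) {
        return CompletableFuture
            .supplyAsync(source, executor)   // fetch off the calling thread
            .thenApply(data -> {
                region.putAll(data);         // bulk-load the "Region"
                onLoaded.run();              // notify interested components
                return data.size();          // number of entries loaded
            });
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Map<String, String> region = new ConcurrentHashMap<>();
        try {
            CompletableFuture<Integer> loaded = preload(region,
                () -> Map.of("p1", "Widget", "p2", "Gadget"),
                executor,
                () -> System.out.println("Region loaded"));
            System.out.println("Entries loaded: " + loaded.join());
        } finally {
            executor.shutdown();
        }
    }
}
```

Calling join() on the returned future blocks until the load finishes, so a
component that must wait can do so explicitly while the rest of startup
proceeds.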

Additionally, if you do not need to preload your Region on startup, then a
CacheLoader is the recommended way to load data into your Region on cache
misses (another synchronous mechanism called a "read-through").
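
To contrast this with preloading, here is a hedged plain-Java model of
read-through: the cache is only populated on a miss.  ReadThroughSketch and
its loader Function are hypothetical stand-ins; a real Geode CacheLoader is
registered on the Region rather than wrapped around a Map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical model of read-through: values are fetched from the backing store only on a miss.
class ReadThroughSketch<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;                 // stand-in for a CacheLoader
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // On a miss the caller blocks while the loader runs; later gets are served from memory.
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // How many times the backing store was actually consulted.
    int loadCount() {
        return loads.get();
    }
}
```

The first get for a key pays the cost of the load, which is the latency a
preload on startup is meant to avoid.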

A word of caution, never, ever auto-wire or inject any beans into a BPP.
To do so could cause premature initialization.  Always rely on the bean
instance passed to the BPP's postProcessXXXX methods.

Thanks,
John


On Sat, Jan 14, 2017 at 1:30 PM, John Blum <jb...@pivotal.io> wrote:

> Hi Amit, Luke-
>
> Thank you Luke.
>
> Actually Luke is mostly correct.  In this case, the order, however, DOES
> NOT matter.  The *Spring* container is intimately aware of certain types
> of beans defined/declared in the *Spring* ApplicationContext.
> BeanPostProcessors, a container extension point (hook), are one of them.
>
> *Spring* creates all BeanPostProcessors (BPP) before any other
> application beans in order to post process each bean defined/declared in
> the container (except for BPPs and BeanFactoryPostProcessors, of
> course).  The container then proceeds to call the BPP *before* the bean
> is "initialized" by the container (i.e.
> postProcessBeforeInitialization(..)) as well as *after* the bean has been
> "initialized".  A bean initialization corresponds to
> InitializingBean.afterPropertiesSet(), any init() methods marked as such
> in XML config or any @PostConstruct methods.
>
> Most SDG FactoryBeans (e.g. PartitionedRegionFactoryBean ->
> RegionFactoryBean) always create their GemFire object (e.g. Region) in
> the afterPropertiesSet() (i.e. initialization) method as the
> <SDG>FactoryBean implements *Spring's* InitializingBean (callback)
> interface.
>
> Therefore, technically, it is safe to define/declare any beans, in any
> order, since the dependencies and callbacks (BPP) pretty much determine the
> order in which beans are constructed, configured and initialized.  SDG even
> takes the Spring container DI concept to the level of ensuring GemFire
> objects are created in the order that GemFire expects based on both
> explicit and implicit dependencies (think Regions and a DiskStore, for
> instance, where the DS is just named in the Region configuration;
> under-the-hood, though, SDG creates a RuntimeReference on the named DS to
> ensure the proper order).  As another example, it is also possible to
> define/declare your Regions before the Cache instance...
>
> <gfe:partitioned-region id="Products" ... />
>
> <gfe:cache/>
>
> SDG does not care how you define your beans and generally will do the right
> thing.  Using JavaConfig is a bit different though and in certain cases you
> have to be a bit more conscientious of the order.
>
> In general, if you had a container with multiple beans defined/declared
> that had NO dependencies between them (or other pre-defined order
> specified, such as when using *Spring's* @Order annotation in an
> AnnotationConfigApplicationContext or by implementing the Ordered
> interface), then *Spring* will pretty much proceed to construct,
> configure and initialize beans in the order they are declared in the
> ApplicationContext config.
>
> Now, if you have multiple BPPs to process the Region, for various reasons,
> then you will need to define order among them by using the @Order
> annotation or by having your custom BPP implement the Ordered interface,
> if order is important.  If an order is not given, then *Spring* makes no
> guarantees which BPP will be invoked first.
>
> Anyway, all of this is well-described in the Spring documentation on "*Customizing
> the nature of a bean*" [1] as well as in "Container Extension Points" [2].
>
> Hope this helps.
>
> -John
>
> [1] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-nature
> [2] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension
>
>
> On Sat, Jan 14, 2017 at 8:38 AM, Amit Pandey <am...@gmail.com>
> wrote:
>
>> Okay...yea as post processors process everything in the IOC thats the
>> only way I guess
>>
>> Thanks
>>
>>
>>
>> On Sat, Jan 14, 2017 at 9:36 PM, Luke Shannon <ls...@pivotal.io>
>> wrote:
>>
>>> Hi Amit,
>>>
>>> In the past I have done it like this:
>>>
>>> Define a BeanPostProcessor like below. It will go out and get the data
>>> from wherever it lives, convert it to objects and then put them into the
>>> region using a Region reference passed in shortly after the region is
>>> initialized. This bean will need to be on the class path of Geode when it
>>> starts up. If using gfsh you can add it to the '--classpath' argument of the
>>> 'start server' command.
>>>
>>> You can then wire this bean into the Geode Cache xml like so:
>>>
>>> <gfe:replicated-region id="Product" />
>>>
>>> <bean id="productLoader" class="mypackage.ProductLoader">
>>>
>>> <property name="targetBeanName" value="Product" />
>>>
>>> </bean>
>>>
>>> Note that this bean is placed *below* your region definitions in the
>>> spring cache xml. If I remember correctly order matters and it will try and
>>> run this before the Region reference is created if the order is not correct.
>>>
>>> Hope this helps,
>>>
>>> Luke
>>>
>>> import java.util.Collections;
>>> import java.util.Map;
>>> import java.util.logging.Logger;
>>> import org.springframework.beans.BeansException;
>>> import org.springframework.beans.factory.config.BeanPostProcessor;
>>> import org.springframework.util.Assert;
>>> import org.springframework.util.StringUtils;
>>> import com.gemstone.gemfire.cache.Region;
>>>
>>> public class ProductLoader implements BeanPostProcessor {
>>>
>>>   // The original snippet did not show how "log" was declared;
>>>   // java.util.logging is assumed here.
>>>   private static final Logger log = Logger.getLogger(ProductLoader.class.getName());
>>>
>>>   private String targetBeanName;
>>>
>>>   protected String getTargetBeanName() {
>>>     Assert.state(StringUtils.hasText(targetBeanName),
>>>         "The target Spring context bean name was not properly specified!");
>>>     return targetBeanName;
>>>   }
>>>
>>>   public void setTargetBeanName(final String targetBeanName) {
>>>     Assert.hasText(targetBeanName, "The target Spring context bean name must be specified!");
>>>     this.targetBeanName = targetBeanName;
>>>   }
>>>
>>>   // Placeholder (not in the original): get your data from where it lives
>>>   // and return it as key/value pairs.
>>>   protected Map<Object, Object> loadProducts() {
>>>     return Collections.emptyMap();
>>>   }
>>>
>>>   @Override
>>>   public Object postProcessBeforeInitialization(final Object bean, final String beanName)
>>>       throws BeansException {
>>>     return bean;
>>>   }
>>>
>>>   @SuppressWarnings({ "unchecked", "rawtypes" })
>>>   @Override
>>>   public Object postProcessAfterInitialization(final Object bean, final String beanName)
>>>       throws BeansException {
>>>     if (beanName.equals(getTargetBeanName()) && bean instanceof Region) {
>>>       // Do a put or a putAll into the region here.
>>>       ((Region) bean).putAll(loadProducts());
>>>       log.info("Preloading complete. Region now has: " + ((Region) bean).size());
>>>     }
>>>     return bean;
>>>   }
>>> }
>>>
>>>
>>> On Sat, Jan 14, 2017 at 10:01 AM, Amit Pandey <amit.pandey2103@gmail.com
>>> > wrote:
>>>
>>>> Hey John,
>>>>
>>>> How do we hook up post processors for a region ?
>>>>
>>>> If I have a region like :-
>>>>
>>>> <gfe:partitioned-region id="trades">
>>>>     <gfe:cache-loader>
>>>>         <bean class="x.y.z.TradeLoader"/>
>>>>     </gfe:cache-loader>
>>>>     <gfe:cache-writer>
>>>>         <bean class="x.y.z.TradeWriter"/>
>>>>     </gfe:cache-writer>
>>>>
>>>>
>>>> </gfe:partitioned-region>
>>>>
>>>>
>>>> How do we hook up the post processor?
>>>>
>>>>
>>>> On Tue, Dec 27, 2016 at 1:22 PM, Amit Pandey <amit.pandey2103@gmail.com
>>>> > wrote:
>>>>
>>>>> Hey,
>>>>>
>>>>> Happy Holidays. Wishing you a great new year :)
>>>>>
>>>>> Regards
>>>>>
>>>>> On Tue, Dec 27, 2016 at 1:08 PM, John Blum <jb...@pivotal.io> wrote:
>>>>>
>>>>>> ;-)  Happy holidays my friend.  Hope you are getting some good R&R.
>>>>>>
>>>>>> On Mon, Dec 26, 2016 at 2:14 PM, Udo Kohlmeyer <ukohlmeyer@pivotal.io
>>>>>> > wrote:
>>>>>>
>>>>>>> it helps a lot! :D
>>>>>>>
>>>>>>> On 12/26/16 12:28, John Blum wrote:
>>>>>>>
>>>>>>> Amit-
>>>>>>>
>>>>>>> Regarding...
>>>>>>>
>>>>>>> *> I want to load all data on cache startup at a go.*
>>>>>>>
>>>>>>> Since you are using "*Spring*", you could easily implement a
>>>>>>> *Spring* BeanPostProcessor [1] (BPP) for each (or all the)
>>>>>>> *Region(s)* in which you need to load data.  I do this frequently
>>>>>>> in *Spring Data GemFire/Geode's* test suite when testing *Region*
>>>>>>> data access operations using the GemfireTemplate, *Repositories* or
>>>>>>> things of that nature.  Clearly your BPP could use a DataSource to
>>>>>>> load the data from an external data store (e.g. RDBMS).
>>>>>>>
>>>>>>> Another way to load data on startup is to use a Geode
>>>>>>> *Initializer*.  However, this would require you to specify a
>>>>>>> snippet of cache.xml and does not work if you specify your *Regions*
>>>>>>> in *Spring* (XML/Java) config as you should when using *Spring*.  I
>>>>>>> also don't recommend using cache.xml, but it is the pure, non-*Spring*
>>>>>>> way to invoke logic after the cache has been "fully" initialized (i.e.
>>>>>>> where the *Regions* have been defined in cache.xml).
>>>>>>>
>>>>>>> See here [2] for more details.  Note, the documentation talks of
>>>>>>> "launching an application" on startup, after cache initialization, but
>>>>>>> technically, you can do whatever you want, like load data.
>>>>>>>
>>>>>>> I recommend the BPP.
>>>>>>>
>>>>>>>
>>>>>>> *> How should I set it up in config to allow it to join other nodes
>>>>>>> in cluster?*
>>>>>>>
>>>>>>> Regardless of whether your server data node is "embedded" or not,
>>>>>>> you can still use a Locator, or mcast to have the node join the cluster.
>>>>>>> The "embedded" scenario, where the "application" is a GemFire Server data
>>>>>>> node will be part of the cluster as Udo said.
>>>>>>>
>>>>>>> This is easily achievable with...
>>>>>>>
>>>>>>> <util:properties id="gemfireProperties">
>>>>>>>   <prop key="name">Example</prop>
>>>>>>>   <!-- Set to non-zero value to use Multicast; comment out "locators" -->
>>>>>>>   <prop key="mcast-port">0</prop>
>>>>>>>   <prop key="log-level">${gemfire.log-level:config}</prop>
>>>>>>>   <prop key="locators">someHost[10334]</prop>
>>>>>>>   <prop key="start-locator">localhost[1034]</prop>
>>>>>>> </util:properties>
>>>>>>>
>>>>>>> <gfe:cache properties-ref="gemfireProperties"/>
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>
>>>>>>> As you can see from the snippet of *Spring* XML config above, this
>>>>>>> application is a Geode "peer" cache (i.e. embeds a Geode data node/server).
>>>>>>>
>>>>>>> The "*locators*" Geode/GemFire property enables this node to
>>>>>>> connect to a cluster.  Likewise, you can use the "*mcast-port*"
>>>>>>> property instead, however, I would recommend *Locators* over mcast.
>>>>>>>
>>>>>>> Additionally, you can see that I specified the "start-locator"
>>>>>>> Geode/GemFire property, which enables me to start an embedded Locator.
>>>>>>> Useful for testing purposes and connecting Geode data nodes together in a
>>>>>>> cluster without a dedicated Locator, though, this approach is less
>>>>>>> resilient if the applications/servers go down (as may be the case in a
>>>>>>> micro-services scenario)!
>>>>>>>
>>>>>>>
>>>>>>> *> if I start with embedded server is it required to use client pool
>>>>>>> or is it not required?*
>>>>>>>
>>>>>>> A "client pool" is only applicable to cache clients (i.e.
>>>>>>> ClientCaches) on the "client-side" of the equation.  "peers" find
>>>>>>> (Locator, mcast) and communicate (TCP/UDP, JGroups) with each other through
>>>>>>> other means once a cluster is formed.
>>>>>>>
>>>>>>> In fact, typically, it is more common to position your
>>>>>>> microservices-based applications as Geode cache clients (i.e. <gfe:client-cache
>>>>>>> ...>) and have them connect to a dedicated Geode service (i.e.
>>>>>>> cluster of Geode servers/data nodes where also, 1 or more of those nodes
>>>>>>> are running a "CacheServer", listening for cache clients to
>>>>>>> connect).  These dedicated Geode server nodes in a cluster constituting the
>>>>>>> service can still be configured with *Spring*, but they typically
>>>>>>> will not contain any application-specific components other than
>>>>>>> CacheListeners, Loaders, Writers, AEQ *Listeners*, etc.
>>>>>>>
>>>>>>> ClientCache applications use 1 or more Pools configured to talk to
>>>>>>> the servers in the cluster (either by way of Locator or direct server
>>>>>>> communication). Pools can be configured with groups to target
>>>>>>> specific members (in that group) in the cluster.  Typically, members in 1
>>>>>>> group host a different set of Regions from another group, which is a way to
>>>>>>> separate data traffic from 1 client to another dedicated to a specific
>>>>>>> resource/purpose (usually based on business function, etc).
>>>>>>>
>>>>>>> On a side note, some of what you are wanting to do "scale-wise"
>>>>>>> seems like a perfect fit for Pivotal CloudFoundry, which can auto-scale up
>>>>>>> or down nodes in your cluster based on load and other factors.
>>>>>>>
>>>>>>> Anyway, hope this helps!
>>>>>>>
>>>>>>> -John
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> [1] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension-bpp
>>>>>>> [2] http://geode.apache.org/docs/guide/basic_config/the_cache/setting_cache_initializer.html
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Dec 25, 2016 at 11:12 PM, Amit Pandey <
>>>>>>> amit.pandey2103@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hey,
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> I have lots of reference data which will be loaded at start of day.
>>>>>>>> This data is not bound to change much and as such I want to keep it loaded
>>>>>>>> at the start of day. Read through will make it slow while it is being
>>>>>>>> actually accessed so I want to keep it loaded in memory.
>>>>>>>>
>>>>>>>> Also I want to have functions which will be called by clients to do
>>>>>>>> some compute and return results. Using functions should allow me to add
>>>>>>>> nodes and speed up the compute.
>>>>>>>>
>>>>>>>> I have some micro services each of which will start a gemfire node,
>>>>>>>> and I want to connect, so yes I can set it up with locator.
>>>>>>>>
>>>>>>>> However I have one doubt, if I start with embedded server is it
>>>>>>>> required to use client pool or is it not required?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> On Mon, Dec 26, 2016 at 1:18 AM, Udo Kohlmeyer <
>>>>>>>> ukohlmeyer@pivotal.io> wrote:
>>>>>>>>
>>>>>>>>> Hi there Amit,
>>>>>>>>>
>>>>>>>>> At this stage the only way you could load all data at one go is to
>>>>>>>>> write a client to connect to the db and load all in. Another approach could
>>>>>>>>> be to write the same code into a function and invoke the function at start
>>>>>>>>> up. But in both cases both are manual.
>>>>>>>>>
>>>>>>>>> To have geode servers join a cluster, you have 2 ways.
>>>>>>>>>
>>>>>>>>>    1. Connecting them up via a locator
>>>>>>>>>    2. Connecting them up via mcast.
>>>>>>>>>
>>>>>>>>> Please be aware that once you connect a server to a cluster, that
>>>>>>>>> server becomes an integral part of the cluster so adding/removing servers
>>>>>>>>> from a cluster is not something you'd want to do in a load-based scaling
>>>>>>>>> model. i.e if the load is high, add a server and if load is low, shut down
>>>>>>>>> a server.
>>>>>>>>>
>>>>>>>>> Just for interest's sake, what is your use case?
>>>>>>>>>
>>>>>>>>> --Udo
>>>>>>>>>
>>>>>>>>> On 12/24/16 05:57, Amit Pandey wrote:
>>>>>>>>>
>>>>>>>>> Hi Guys,
>>>>>>>>>
>>>>>>>>> I am using Spring Data Geode. I have been able to use read and
>>>>>>>>> write through/ write behind. I want to load all data on cache startup at a
>>>>>>>>> go.
>>>>>>>>>
>>>>>>>>> Secondly my geode server is embedded but I want to allow it to join
>>>>>>>>> other nodes.  How should I set it up in config to allow it to join other
>>>>>>>>> nodes in cluster?
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -John
>>>>>>> john.blum10101 (skype)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -John
>>>>>> john.blum10101 (skype)
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Luke Shannon | Platform Engineering | Pivotal
>>> -------------------------------------------------------------------------
>>>
>>> Mobile:416-571-9495
>>> Join the Toronto Pivotal Usergroup: http://www.meetup.com/Toronto-Pivotal-User-Group/
>>>
>>
>>
>
>
> --
> -John
> john.blum10101 (skype)
>



-- 
-John
john.blum10101 (skype)

Re: Load all data from DB on Cache Start

Posted by John Blum <jb...@pivotal.io>.

Hi Amit, Luke-

Thank you Luke.

Actually Luke is mostly correct.  In this case, the order, however, DOES
NOT matter.  The *Spring* container is intimately aware of certain types of
beans defined/declared in the *Spring* ApplicationContext.
BeanPostProcessors, a container extension point (hook), are one of them.

*Spring* creates all BeanPostProcessors (BPP) before any other application
beans in order to post process each bean defined/declared in the container
(except for BPPs and BeanFactoryPostProcessors, of course).  The container
then proceeds to call the BPP *before* the bean is "initialized" by the
container (i.e. postProcessBeforeInitialization(..)) as well as *after* the
bean has been "initialized".  A bean initialization corresponds to
InitializingBean.afterPropertiesSet(), any init() methods marked as such in
XML config or any @PostConstruct methods.

Most SDG FactoryBeans (e.g. PartitionedRegionFactoryBean ->
RegionFactoryBean) always create their GemFire object (e.g. Region) in the
afterPropertiesSet() (i.e. initialization) method as the <SDG>FactoryBean
implements *Spring's* InitializingBean (callback) interface.
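
The callback sequence described above can be modeled in a few lines of
plain Java.  This is a deliberately simplified sketch, not *Spring* code:
LifecycleSketch, PostProcessor, refresh and loaderFor are hypothetical
names, and the real container does far more (dependency resolution,
FactoryBeans and so on).  It only shows that every bean is wrapped by the
before/after callbacks around its own initialization step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified model of the callback order for each bean.
class LifecycleSketch {

    interface PostProcessor {
        void before(String beanName, List<String> log);
        void after(String beanName, List<String> log);
    }

    // Processors exist before any bean is initialized; each bean then goes
    // through before -> initialization -> after, in declaration order.
    static List<String> refresh(List<String> beanNames, List<PostProcessor> processors) {
        List<String> log = new ArrayList<>();
        for (String name : beanNames) {
            for (PostProcessor p : processors) p.before(name, log);
            log.add(name + ": afterPropertiesSet");   // e.g. where SDG creates the Region
            for (PostProcessor p : processors) p.after(name, log);
        }
        return log;
    }

    // A processor that only reacts to one target bean, similar in spirit to ProductLoader.
    static PostProcessor loaderFor(String target) {
        return new PostProcessor() {
            public void before(String beanName, List<String> log) {
                // nothing to do before initialization
            }
            public void after(String beanName, List<String> log) {
                if (beanName.equals(target)) log.add(beanName + ": preload");
            }
        };
    }
}
```

Because the processor is handed each bean after that bean's own
initialization, where the processor was declared relative to the target
bean plays no part in the sequence.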

Therefore, technically, it is safe to define/declare any beans, in any
order, since the dependencies and callbacks (BPP) pretty much determine the
order in which beans are constructed, configured and initialized.  SDG even
takes the Spring container DI concept to the level of ensuring GemFire
objects are created in the order that GemFire expects based on both
explicit and implicit dependencies (think Regions and a DiskStore, for
instance, where the DS is just named in the Region configuration;
under-the-hood, though, SDG creates a RuntimeReference on the named DS to
ensure the proper order).  As another example, it is also possible to
define/declare your Regions before the Cache instance...

<gfe:partitioned-region id="Products" ... />

<gfe:cache/>

SDG does not care how you define your beans and generally will do the right
thing.  Using JavaConfig is a bit different, though, and in certain cases you
have to be a bit more conscientious of the order.

In general, if you have a container with multiple beans defined/declared
that have NO dependencies between them (or other pre-defined order
specified, such as when using *Spring's* @Order annotation in an
AnnotationConfigApplicationContext or by implementing the Ordered
interface), then *Spring* will pretty much proceed to construct, configure
and initialize beans in the order they are declared in the
ApplicationContext config.

Now, if you have multiple BPPs to process the Region, for various reasons,
and order is important, then you will need to define order among them by
having your custom BPPs implement the Ordered (or PriorityOrdered)
interface; note that the @Order annotation is not taken into account for
BPPs.  If an order is not given, then *Spring* makes no guarantees which BPP
will be invoked first.
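
As a rough illustration of ordered post processing, here is a small plain-Java model. HasOrder, Step, OrderedStep and OrderingDemo are hypothetical names invented for this sketch; they only imitate the idea behind Spring's Ordered interface (a lower value runs earlier) and are not the Spring implementation. Steps without an order are applied after the ordered ones, in the sequence they were declared, which should be read as an assumption of this model rather than a guarantee.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Stand-in for the idea behind Spring's Ordered contract: a lower value runs earlier.
interface HasOrder {
    int getOrder();
}

// Stand-in for a post processor; it appends its label so the call order is visible.
class Step {
    final String label;

    Step(String label) {
        this.label = label;
    }

    String apply(String bean) {
        return bean + ">" + label;
    }
}

// A step that declares an explicit order.
class OrderedStep extends Step implements HasOrder {
    private final int order;

    OrderedStep(String label, int order) {
        super(label);
        this.order = order;
    }

    @Override
    public int getOrder() {
        return order;
    }
}

class OrderingDemo {

    // Applies ordered steps first (ascending order value), then the unordered ones.
    static String process(String bean, List<Step> declared) {
        List<Step> ordered = new ArrayList<>();
        List<Step> unordered = new ArrayList<>();
        for (Step step : declared) {
            if (step instanceof HasOrder) {
                ordered.add(step);
            } else {
                unordered.add(step);
            }
        }
        ordered.sort(Comparator.comparingInt((Step step) -> ((HasOrder) step).getOrder()));

        String result = bean;
        for (Step step : ordered) {
            result = step.apply(result);
        }
        for (Step step : unordered) {
            result = step.apply(result);
        }
        return result;
    }
}
```

With steps declared as audit (order 2), metrics (no order) and load (order 1), the model applies load, then audit, then metrics.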

Anyway, all of this is well-described in the Spring documentation on
"*Customizing the nature of a bean*" [1] as well as in "Container Extension
Points" [2].

Hope this helps.

-John

[1]
http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-nature
[2]
http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension


On Sat, Jan 14, 2017 at 8:38 AM, Amit Pandey <am...@gmail.com>
wrote:

> Okay...yea as post processors process everything in the IOC thats the only
> way I guess
>
> Thanks

Re: Load all data from DB on Cache Start

Posted by Amit Pandey <am...@gmail.com>.

Okay... yeah, as post processors process everything in the IoC container,
that's the only way I guess

Thanks



On Sat, Jan 14, 2017 at 9:36 PM, Luke Shannon <ls...@pivotal.io> wrote:

> Hi Amit,
>
> In the past I have done it like this:

Re: Load all data from DB on Cache Start

Posted by Luke Shannon <ls...@pivotal.io>.

Hi Amit,

In the past I have done it like this:

Define a BeanPostProcessor like the one below. It will go out and get the
data from wherever it lives, convert it to objects and then put them into
the region using a Region reference passed in shortly after the region is
initialized. This bean will need to be on the classpath of Geode when it
starts up. If using gfsh you can add it to the '--classpath' argument of the
'start server' command.

You can then wire this bean into the Spring cache XML like so:

<gfe:replicated-region id="Product" />

<bean id="productLoader" class="mypackage.ProductLoader">
  <property name="targetBeanName" value="Product" />
</bean>

Note that this bean is placed *below* your region definitions in the spring
cache xml. If I remember correctly order matters and it will try and run
this before the Region reference is created if the order is not correct.

Hope this helps,

Luke

import java.util.Collections;
import java.util.Map;
import java.util.logging.Logger;

import org.springframework.beans.BeansException;
import org.springframework.beans.factory.config.BeanPostProcessor;
import org.springframework.util.Assert;
import org.springframework.util.StringUtils;

import com.gemstone.gemfire.cache.Region;

public class ProductLoader implements BeanPostProcessor {

  private static final Logger log = Logger.getLogger(ProductLoader.class.getName());

  // Name of the Region bean to preload, set from the Spring XML config.
  private String targetBeanName;

  protected String getTargetBeanName() {
    Assert.state(StringUtils.hasText(targetBeanName),
        "The target Spring context bean name was not properly specified!");
    return targetBeanName;
  }

  public void setTargetBeanName(final String targetBeanName) {
    Assert.hasText(targetBeanName, "The target Spring context bean name must be specified!");
    this.targetBeanName = targetBeanName;
  }

  @Override
  public Object postProcessBeforeInitialization(final Object bean, final String beanName)
      throws BeansException {
    return bean;
  }

  @Override
  @SuppressWarnings({ "unchecked", "rawtypes" })
  public Object postProcessAfterInitialization(final Object bean, final String beanName)
      throws BeansException {
    if (beanName.equals(getTargetBeanName()) && bean instanceof Region) {
      // Get your data from where it lives and do a put or a putAll into the region here.
      ((Region) bean).putAll(loadProducts());
      log.info("Preloading complete. Region now has: " + ((Region) bean).size());
    }
    return bean;
  }

  // Placeholder (not part of any API): replace with code that reads your
  // <Key For Product> / <Product Value> pairs from wherever the data lives.
  protected Map<Object, Object> loadProducts() {
    return Collections.emptyMap();
  }
}
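
For a large data set, the "put or put all" step above could be done in chunks rather than one entry at a time. The following is only a sketch under stated assumptions: BatchPreloader and its preload method are hypothetical names, the target is typed as java.util.Map (which a Region can be passed as, since Region extends Map), and the source iterator stands in for whatever reads rows from your database. It assumes the keys coming from the source are unique.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical helper: copies entries into a target map in fixed-size batches.
class BatchPreloader {

    // Reads every entry from the source and writes it to the target using
    // putAll once per batch. Returns the number of entries written
    // (assumes unique keys in the source).
    static <K, V> int preload(Iterator<Map.Entry<K, V>> source, Map<K, V> target, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Map<K, V> batch = new HashMap<>();
        int total = 0;
        while (source.hasNext()) {
            Map.Entry<K, V> entry = source.next();
            batch.put(entry.getKey(), entry.getValue());
            if (batch.size() == batchSize) {
                target.putAll(batch);
                total += batch.size();
                batch.clear();
            }
        }
        // Flush the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            target.putAll(batch);
            total += batch.size();
        }
        return total;
    }
}
```

Inside postProcessAfterInitialization this might be called as BatchPreloader.preload(rows, region, 1000), where rows and the batch size of 1000 are placeholders to tune for your data.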


On Sat, Jan 14, 2017 at 10:01 AM, Amit Pandey <am...@gmail.com>
wrote:

> Hey John,
>
> How do we hook up post processors for a region ?
>
> If I have a region like :-
>
> <gfe:partitioned-region id="trades">
>     <gfe:cache-loader>
>         <bean class="x.y.z.TradeLoader"/>
>     </gfe:cache-loader>
>     <gfe:cache-writer>
>         <bean class="x.y.z.TradeWriter"/>
>     </gfe:cache-writer>
>
>
> </gfe:partitioned-region>
>
>
> How do we hook up the post processor?
>
>
> On Tue, Dec 27, 2016 at 1:22 PM, Amit Pandey <am...@gmail.com>
> wrote:
>
>> Hey,
>>
>> Happy Holidays. Wishing you a great new year :)
>>
>> Regards
>>
>> On Tue, Dec 27, 2016 at 1:08 PM, John Blum <jb...@pivotal.io> wrote:
>>
>>> ;-)  Happy holidays my friend.  Hope you are getting some good R&R.
>>>
>>> On Mon, Dec 26, 2016 at 2:14 PM, Udo Kohlmeyer <uk...@pivotal.io>
>>> wrote:
>>>
>>>> it helps a lot! :D
>>>>
>>>> On 12/26/16 12:28, John Blum wrote:
>>>>
>>>> Amit-
>>>>
>>>> Regarding...
>>>>
>>>> *> I want to load all data on cache startup at a go.*
>>>>
>>>> Since you are using "*Spring*", you could easily implement a *Spring*
>>>> BeanPostProcessor [1] (BPP) for each (or all the) *Region(s)* in which
>>>> you need to load data.  I do this frequently in *Spring Data
>>>> GemFire/Geode's* test suite when testing *Region* data access
>>>> operations using the GemfireTemplate, *Repositories* or things of that
>>>> nature.  Clearly your BPP could use a DataSource to load the data from
>>>> an external data store (e.g. RDBMS).
>>>>
>>>> Another way to load data on startup is to use a Geode *Initializer*.
>>>> However, this would require you to specify a snippet of cache.xml and
>>>> does not work if you specify your *Regions* in *Spring* (XML/Java)
>>>> config as you should when using *Spring*.  I also don't recommend
>>>> using cache.xml, but it is the pure, non-*Spring* way to invoke logic
>>>> after the cache has been "fully" initialized (i.e. where the *Regions* have
>>>> been defined in cache.xml).
>>>>
>>>> See here [2] for more details.  Note, the documentation talks of
>>>> "launching an application" on startup, after cache initialization, but
>>>> technically, you can do whatever you want, like load data.
>>>>
>>>> I recommend the BPP.
>>>>
>>>>
>>>> *> How should I set it up in config to allow it to join other nodes in
>>>> cluster?*
>>>>
>>>> Regardless of whether your server data node is "embedded" or not, you
>>>> can still use a Locator, or mcast to have the node join the cluster.  The
>>>> "embedded" scenario, where the "application" is a GemFire Server data node
>>>> will be part of the cluster as Udo said.
>>>>
>>>> This is easily achievable with...
>>>>
>>>> <util:properties id="gemfireProperties">
>>>>   <prop key="name">Example</prop>
>>>>   <!-- Set to non-zero value to use Multicast; comment out "locators" -->
>>>>   <prop key="mcast-port">0</prop>
>>>>   <prop key="log-level">${gemfire.log-level:config}</prop>
>>>>   <prop key="locators">someHost[10334]</prop>
>>>>   <prop key="start-locator">localhost[1034]</prop>
>>>> </util:properties>
>>>>
>>>> <gfe:cache properties-ref="gemfireProperties"/>
>>>>
>>>> ...
>>>>
>>>>
>>>> As you can see from the snippet of *Spring* XML config above, this
>>>> application is a Geode "peer" cache (i.e. embeds a Geode data node/server).
>>>>
>>>> The "*locators*" Geode/GemFire property enables this node to connect
>>>> to a cluster.  Alternatively, you can use the "*mcast-port*" property;
>>>> however, I would recommend *Locators* over mcast.
>>>>
>>>> Additionally, you can see that I specified the "start-locator"
>>>> Geode/GemFire property, which enables me to start an embedded Locator.
>>>> This is useful for testing purposes and for connecting Geode data nodes
>>>> together in a cluster without a dedicated Locator, though this approach is
>>>> less resilient if the applications/servers go down (as may be the case in a
>>>> micro-services scenario)!
>>>>
>>>>
>>>> *> if I start with embedded server is it required to use client pool or
>>>> is it not required?*
>>>>
>>>> A "client pool" is only applicable to cache clients (i.e. ClientCaches)
>>>> on the "client-side" of the equation.  "peers" find (Locator, mcast) and
>>>> communicate (TCP/UDP, JGroups) with each other through other means once a
>>>> cluster is formed.
>>>>
>>>> In fact, it is more common to position your
>>>> microservices-based applications as Geode cache clients (i.e. <gfe:client-cache
>>>> ...>) and have them connect to a dedicated Geode service (i.e. a cluster
>>>> of Geode servers/data nodes where 1 or more of those nodes are also
>>>> running a "CacheServer", listening for cache clients to connect).
>>>> The dedicated Geode server nodes constituting the service
>>>> can still be configured with *Spring*, but they typically will not
>>>> contain any application-specific components other than CacheListeners,
>>>> Loaders, Writers, AEQ *Listeners*, etc.
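>>>>
>>>> For illustration only, such a dedicated server node might be configured
>>>> roughly as follows.  The port, the "Example" *Region* and the
>>>> example.app.ExampleLoader class are hypothetical, and the
>>>> "gemfireProperties" bean is assumed to be the one shown earlier:
>>>>
>>>> ```xml
>>>> <gfe:cache properties-ref="gemfireProperties"/>
>>>>
>>>> <!-- Runs a CacheServer that listens for cache clients to connect -->
>>>> <gfe:cache-server port="40404"/>
>>>>
>>>> <!-- Hypothetical Region with a hypothetical CacheLoader -->
>>>> <gfe:partitioned-region id="Example">
>>>>   <gfe:cache-loader>
>>>>     <bean class="example.app.ExampleLoader"/>
>>>>   </gfe:cache-loader>
>>>> </gfe:partitioned-region>
>>>> ```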
>>>>
>>>> ClientCache applications use 1 or more Pools configured to talk to the
>>>> servers in the cluster (either by way of a Locator or direct server
>>>> communication). Pools can be configured with groups to target specific
>>>> members (in that group) in the cluster.  Typically, members in 1 group host
>>>> a different set of Regions from members in another group, which is a way to
>>>> separate data traffic so that each client talks to servers dedicated to a
>>>> specific resource/purpose (usually based on business function, etc).
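>>>>
>>>> A sketch of what the client side could look like in *Spring* XML
>>>> config, assuming a Locator at someHost[10334], a hypothetical server
>>>> group named "exampleGroup" and a hypothetical "Example" *Region* hosted
>>>> by the servers in that group:
>>>>
>>>> ```xml
>>>> <!-- Pool that discovers servers in the (hypothetical) group via the Locator -->
>>>> <gfe:pool id="examplePool" server-group="exampleGroup">
>>>>   <gfe:locator host="someHost" port="10334"/>
>>>> </gfe:pool>
>>>>
>>>> <gfe:client-cache pool-name="examplePool"/>
>>>>
>>>> <!-- PROXY keeps no local state; all operations go to the servers -->
>>>> <gfe:client-region id="Example" pool-name="examplePool" shortcut="PROXY"/>
>>>> ```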
>>>>
>>>> On a side note, some of what you are wanting to do "scale-wise" seems
>>>> like a perfect fit for Pivotal CloudFoundry, which can auto-scale up or
>>>> down nodes in your cluster based on load and other factors.
>>>>
>>>> Anyway, hope this helps!
>>>>
>>>> -John
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> [1] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension-bpp
>>>> [2] http://geode.apache.org/docs/guide/basic_config/the_cache/setting_cache_initializer.html
>>>>
>>>>
>>>> On Sun, Dec 25, 2016 at 11:12 PM, Amit Pandey <
>>>> amit.pandey2103@gmail.com> wrote:
>>>>
>>>>> Hey,
>>>>>
>>>>> Thanks.
>>>>>
>>>>> I have lots of reference data which will be loaded at start of day.
>>>>> This data is not bound to change much and as such I want to keep it loaded
>>>>> at the start of day. Read through will make it slow while it is being
>>>>> actually accessed so I want to keep it loaded in memory.
>>>>>
>>>>> Also I want to have functions which will be called by clients to do
>>>>> some compute and return results. Using functions should allow me to add
>>>>> nodes and speed up the compute.
>>>>>
>>>>> I have some micro services each of which will start a gemfire node,
>>>>> and I want to connect, so yes I can set it up with locator.
>>>>>
>>>>> However I have one doubt, if I start with embedded server is it
>>>>> required to use client pool or is it not required?
>>>>>
>>>>> Regards
>>>>>
>>>>> On Mon, Dec 26, 2016 at 1:18 AM, Udo Kohlmeyer <uk...@pivotal.io>
>>>>> wrote:
>>>>>
>>>>>> Hi there Amit,
>>>>>>
>>>>>> At this stage the only way you could load all data in one go is to
>>>>>> write a client to connect to the db and load it all in. Another approach could
>>>>>> be to write the same code into a function and invoke the function at start
>>>>>> up. But both approaches are manual.
>>>>>>
>>>>>> To have geode servers join a cluster, you have 2 ways.
>>>>>>
>>>>>>    1. Connecting them up via a locator
>>>>>>    2. Connecting them up via mcast.
>>>>>>
>>>>>> Please be aware that once you connect a server to a cluster, that
>>>>>> server becomes an integral part of the cluster, so adding/removing servers
>>>>>> from a cluster is not something you'd want to do in a load-based scaling
>>>>>> model, i.e. if the load is high, add a server and if load is low, shut down
>>>>>> a server.
>>>>>>
>>>>>> Just for interest's sake, what is your use case?
>>>>>>
>>>>>> --Udo
>>>>>>
>>>>>> On 12/24/16 05:57, Amit Pandey wrote:
>>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> I am using Spring Data Geode. I have been able to use read and write
>>>>>> through/ write behind. I want to load all data on cache startup at a go.
>>>>>>
>>>>>> Secondly, my geode server is embedded but I want to allow it to join
>>>>>> other nodes.  How should I set it up in config to allow it to join other
>>>>>> nodes in cluster?
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> -John
>>>> john.blum10101 (skype)
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> -John
>>> john.blum10101 (skype)
>>>
>>
>>
>


-- 
Luke Shannon | Platform Engineering | Pivotal
-------------------------------------------------------------------------

Mobile:416-571-9495
Join the Toronto Pivotal Usergroup:
http://www.meetup.com/Toronto-Pivotal-User-Group/