Posted to users@openjpa.apache.org by Jonas Petersen <jo...@mindfloaters.de> on 2008/04/30 21:25:22 UTC

Synchronizing two databases with the same model

Hi there!

We have one data model and we need two datastores with that same data 
model. Datastore A for editing and previewing and datastore B for 
production (live).

Now we need to synchronize parts of datastore A to datastore B.

The most obvious approach would be: fetch objects from datastore A (and 
possibly detach the objects) and then merge them into datastore B. But this 
raises a couple of problems due to versioning / sequence generators / 
optimistic locking / ...

e.g.:
- If objects (detached from datastore A) do not exist in datastore B, 
they are assumed deleted and an exception is thrown
- Since we're using the GeneratedValue annotation for ids, objects would 
not be able to get persisted in datastore B even if they were new.

Do you have any idea how to solve this problem in a regular way?

One (non-JPA) way would be to implement it with native queries. Maybe 
this is the only way? It would probably be harder to maintain, though.
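
As a rough illustration of the native route, the following JDBC sketch copies one row, selected by primary key, from datastore A to datastore B and keeps the id that was assigned in A. It is only a sketch under assumptions: it relies on MySQL's INSERT ... ON DUPLICATE KEY UPDATE syntax, the class and method names are invented for this example, the first column in the list is assumed to be the primary key, and table and column names must come from trusted code because they are concatenated into the SQL text.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of OpenJPA: copies rows between two
// connections while preserving the primary key values of the source.
class NativeRowCopy {

    // Builds a MySQL upsert. Assumption: columns.get(0) is the primary key.
    static String upsertSql(String table, List<String> columns) {
        if (columns.size() < 2) {
            throw new IllegalArgumentException("need the key plus at least one column");
        }
        String cols = String.join(", ", columns);
        String marks = columns.stream().map(c -> "?").collect(Collectors.joining(", "));
        String updates = columns.stream().skip(1)
                .map(c -> c + " = VALUES(" + c + ")")
                .collect(Collectors.joining(", "));
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + marks + ")"
                + " ON DUPLICATE KEY UPDATE " + updates;
    }

    // Builds the matching SELECT that reads one row by its primary key.
    static String selectSql(String table, List<String> columns) {
        return "SELECT " + String.join(", ", columns) + " FROM " + table
                + " WHERE " + columns.get(0) + " = ?";
    }

    // Reads the row with the given id from the source and writes it to the
    // target, inserting it if it is new and updating it otherwise.
    // Returns the number of rows read from the source.
    static int copyRow(Connection source, Connection target, String table,
                       List<String> columns, long id) throws SQLException {
        try (PreparedStatement read = source.prepareStatement(selectSql(table, columns));
             PreparedStatement write = target.prepareStatement(upsertSql(table, columns))) {
            read.setLong(1, id);
            int copied = 0;
            try (ResultSet rs = read.executeQuery()) {
                while (rs.next()) {
                    for (int i = 1; i <= columns.size(); i++) {
                        write.setObject(i, rs.getObject(i));
                    }
                    write.executeUpdate();
                    copied++;
                }
            }
            return copied;
        }
    }
}
```

Child rows would need the same treatment, parents first, ideally inside a single transaction on the target connection. A version column, if present, is copied like any other column in this sketch.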

Thanks for any suggestion!

Regards
Jonas

Re: Synchronizing two databases with the same model

Posted by Pinaki Poddar <pp...@apache.org>.
Hi,
  Here is a prototype for migrating data using OpenJPA, attached as a ~100KB
jar that *must* precede the OpenJPA jars on the classpath. 

  The solution is based on an OpenJPA module named 'Slice', which allows a
single EntityManager to be connected to multiple databases (or slices) in the
*same* transaction. This notion is extended to create a specialized Slice
configuration that accepts two and only two databases, as 'source' and
'target' slices, and a specialized StoreManager that directs all read
operations to the 'source' slice and all write operations to the 'target'
slice. The user-visible effect is migration, i.e. whatever data is read *and
dirtied* from the 'source' database is written to the 'target' database. 
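
The read-from-source, write-to-target behavior can be pictured with a toy in-memory model. This is not OpenJPA or Slice code: the class name, the use of plain maps as stand-ins for the two databases, and the String "state" of an instance are all assumptions made for the sake of a small runnable sketch.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: one context, reads served by 'source', writes sent to 'target'.
class ReadSourceWriteTarget {
    private final Map<Long, String> source;  // stands in for the 'source' database
    private final Map<Long, String> target;  // stands in for the 'target' database
    private final Map<Long, String> managed = new HashMap<>();  // the single context
    private final Set<Long> dirty = new LinkedHashSet<>();      // ids to write on commit

    ReadSourceWriteTarget(Map<Long, String> source, Map<Long, String> target) {
        this.source = source;
        this.target = target;
    }

    // Every read is answered from the source (cached in the context afterwards).
    String find(long id) {
        return managed.computeIfAbsent(id, source::get);
    }

    // Marks a loaded instance as dirty; returns false if the source has no such id.
    boolean markDirty(long id) {
        if (find(id) == null) {
            return false;
        }
        dirty.add(id);
        return true;
    }

    // Writes only the dirtied instances, and only to the target.
    // Returns the number of instances written.
    int commit() {
        for (Long id : dirty) {
            target.put(id, managed.get(id));
        }
        int written = dirty.size();
        dirty.clear();
        return written;
    }

    public static void main(String[] args) {
        Map<Long, String> source = new HashMap<>();
        source.put(1L, "draft page");
        source.put(2L, "approved page");
        Map<Long, String> target = new HashMap<>();

        ReadSourceWriteTarget store = new ReadSourceWriteTarget(source, target);
        store.find(1L);       // read only: nothing is migrated
        store.markDirty(2L);  // read and dirtied: migrated on commit
        System.out.println(store.commit() + " instance(s) written, target = " + target);
    }
}
```

In the model, an instance that is only read never reaches the target, which mirrors the "read *and dirtied*" condition described above.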

  Because of its special nature, I named this configuration 'nomad', and
hence the configuration for migration looks like this:
<persistence-unit name="migrate">
    <properties>
        <property name="openjpa.BrokerFactory" value="nomad"/>
        <property name="openjpa.ConnectionDriverName" value="com.mysql.jdbc.Driver"/>
        <property name="openjpa.slice.source.ConnectionURL" value="jdbc:mysql://localhost/data1"/>
        <property name="openjpa.slice.target.ConnectionURL" value="jdbc:mysql://localhost/data2"/>
    </properties>
</persistence-unit>

  Of course, the 'source' and 'target' slices can use different database
drivers etc., as per-slice configuration allows. The only things to remember
are that the slice names are hard-coded as 'source' and 'target' and that the
specialized BrokerFactory is aliased as 'nomad'.
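
For the case where the two slices use different drivers, a small sketch of the corresponding property set is given below. The per-slice key openjpa.slice.<name>.ConnectionDriverName is an assumption, formed by analogy with the per-slice ConnectionURL keys shown above; whether the prototype honors it has not been verified here. The class name and the PostgreSQL target are likewise only examples.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that assembles property overrides for the 'migrate' unit.
class NomadProperties {

    // The slice names 'source' and 'target' and the 'nomad' alias come from the
    // post; the per-slice ConnectionDriverName keys are an assumption.
    static Map<String, String> forSlices(String sourceDriver, String sourceUrl,
                                         String targetDriver, String targetUrl) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("openjpa.BrokerFactory", "nomad");
        props.put("openjpa.slice.source.ConnectionDriverName", sourceDriver);
        props.put("openjpa.slice.source.ConnectionURL", sourceUrl);
        props.put("openjpa.slice.target.ConnectionDriverName", targetDriver);
        props.put("openjpa.slice.target.ConnectionURL", targetUrl);
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = forSlices(
                "com.mysql.jdbc.Driver", "jdbc:mysql://localhost/data1",
                "org.postgresql.Driver", "jdbc:postgresql://localhost/data2");
        // Print the overrides in insertion order.
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Such a map could be handed to the standard JPA call Persistence.createEntityManagerFactory("migrate", props) instead of, or in addition to, the entries in persistence.xml.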

  The behavior is exemplified by the attached JUnit Testcase.

http://www.nabble.com/file/p17330349/openjpa-nomad.jar openjpa-nomad.jar 
http://www.nabble.com/file/p17330349/persistence.xml persistence.xml 
http://www.nabble.com/file/p17330349/TestMigration.java TestMigration.java 
http://www.nabble.com/file/p17330349/PObject.java PObject.java  

   Note that this is a quick prototype and whoever plans to play with it
should be ready for exception stack traces.
-- 
View this message in context: http://www.nabble.com/Synchronizing-two-databases-with-the-same-model-tp16989856p17330349.html
Sent from the OpenJPA Users mailing list archive at Nabble.com.


Re: Synchronizing two databases with the same model

Posted by Pinaki Poddar <pp...@apache.org>.
Hi Jonas,
  I just finished a prototype that specializes Slice to migrate selected
instances from one database to another. I am not planning to commit it to
trunk right now, mainly because the trick that made it work requires a
single-line but dangerous change in OpenJPA :). If you are interested, I can
send you a small jar (~20KB) that you can play with, and you can give us some
feedback on the feasibility of this approach. 

  Additional information on Slice is available at [1][2][3], and it is also
documented on SVN trunk. 

[1] http://people.apache.org/~ppoddar/slice/site/index.html
[2] http://dev2dev.bea.com/blog/pinaki.poddar/archive/2007/12/slice_openjpa_f.html
[3] http://dev2dev.bea.com/blog/pinaki.poddar/archive/2008/01/slice_openjpa_f_1.html



Re: Synchronizing two databases with the same model

Posted by Jonas Petersen <jo...@mindfloaters.de>.
Hi Pinaki,

thank you, this sounds interesting. The Slice plug-in looks quite 
promising; I'm definitely interested in learning more about it.

Jonas


RE: Synchronizing two databases with the same model

Posted by Brill Pappin <br...@pappin.ca>.
I haven't really been following the thread, but I just thought of an old
project that might still be active: C-JDBC, an early attempt to provide
driver-level replication, clustering and fail-over.

If it's still around, it might be able to do your replication.

A quick search pulled up this, which has a news item for May 2008, so it
looks like it's still active:
http://c-jdbc.objectweb.org/

- Brill

-----Original Message-----
From: Pinaki Poddar [mailto:ppoddar@apache.org] 
Sent: Wednesday, May 07, 2008 3:28 PM
To: users@openjpa.apache.org
Subject: Re: Synchronizing two databases with the same model



Re: Synchronizing two databases with the same model

Posted by Pinaki Poddar <pp...@apache.org>.
Hi,
> The most obvious approach would be: fetch objects from datastore A (and 
> possibly detach the objects) and then merge them into datastore B. But this 
> raises a couple of problems due to versioning / sequence generators / 
> optimistic locking / ...

  An alternative approach to consider is a single StoreManager with two
database connections: a 'read' connection to the 'source' database and a
'write' connection to the 'target' database. The JPA EntityManager interface
remains intact but, under the hood, all 'read' operations happen on the
'source' database while any modification is written to the 'target' database. 

  I prefer this approach because all the instances are then managed by the
same persistence context, rather than being realized in one context,
detached and then merged into another. The 'migration application' also
becomes simple: one can issue a query, dirty all the selected instances and
then commit. The effect is to migrate all the selected objects from the
'source' database to the 'target' database.

  The OpenJPA Slice module already has some support for handling multiple
databases in the same persistence context. I tweaked Slice a bit to get the
'migration' feature described above.  

  If you are interested in exploring this further, let me know.



Re: Synchronizing two databases with the same model

Posted by Jonas Petersen <jo...@mindfloaters.de>.
Hi Michael,

thank you for the response! I'm afraid that this won't help much. It's 
about notification. If I get notified I still have to synchronize.

Further it "allows a subset of the information available through 
OpenJPA's transaction events to be broadcast to remote listeners". And 
"...that will be alerted with a list of modified object ids whenever a 
transaction on a remote machine successfully commits". So it's tied to a 
transaction? What we need is to synchronize a certain state of the db 
(or part of it). And that could be the essence of many transactions.

Regards
Jonas


RE: Synchronizing two databases with the same model

Posted by Michael Vorburger <mv...@odyssey-group.com>.
Jonas,

I wonder if the OpenJPA "Remote and Offline Operation" stuff
(http://openjpa.apache.org/docs/latest/manual/ref_guide_event.html) may
allow you to build what you're after...

Regards,
Michael


-----Original Message-----
From: Andy Schlaikjer [mailto:hazen+@cs.cmu.edu] 
Sent: Thursday, May 1, 2008 18:44
To: users@openjpa.apache.org
Subject: Re: Synchronizing two databases with the same model


Re: Synchronizing two databases with the same model

Posted by Ognjen Blagojevic <og...@etf.bg.ac.yu>.
I agree. I also need to fall back to native SQL when migrating data, just 
because we would like to preserve primary keys.

Regards,
Ognjen



Re: Synchronizing two databases with the same model

Posted by Andy Schlaikjer <ha...@cs.cmu.edu>.
Jonas,

I'm glad you asked this question as I'd also been thinking about how I 
might get around restrictions the @GeneratedValue annotation enforces 
within OpenJPA.

In certain circumstances I need to specify the value of a field marked 
with @GeneratedValue explicitly when persisting a new entity instance. 
I'd hoped that perhaps "merge" semantics would differ from "persist" 
semantics with respect to this constraint, but it seems (quite 
logically) that the constraint is applied uniformly to all entity 
life-cycle operations.

For the time being I've had to fall back on JDBC to persist new data, 
but it'd be great if there were a way to signal to OpenJPA that a field 
marked with GeneratedValue may be explicitly defined for certain 
operations, like merge or persist.

Andy

Re: Synchronizing two databases with the same model

Posted by Jonas Petersen <jo...@mindfloaters.de>.
Hi Brill,

thanks for replying. The thing is that only certain parts (e.g. an 
entity with a certain id, including its child objects) have to be 
synchronized at a time, on demand by a content editor.

The datastore is MySQL 5.

Jonas


RE: Synchronizing two databases with the same model

Posted by Brill Pappin <br...@pappin.ca>.
Actually, the most obvious approach is not to write any special code, but
simply to enable replication and not worry about trying to get OJPA to sync.
What kind of database is it (most popular databases have replication of one
sort or another)?

- Brill Pappin
