You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@openjpa.apache.org by Abe White <aw...@bea.com> on 2008/04/04 19:22:18 UTC

Merge strategies and OPENJPA-245

OpenJPA typically uses an enhancer-added "detached state field" to
differentiate newly-constructed instances that need to be inserted vs.
detached instances that need to be updated on merge.  This is covered in
more detail in the user manual.

OpenJPA also allows users with version fields to manually construct a
"detached" instance just by setting the pk fields to the pks of an
existing instance and setting the version field appropriately.  A
non-default version field value is considered a directive to treat an
instance as existing on merge, even if the detached state is missing.

The OPENJPA-245 JIRA issue describes a case in which a user wants to
manually construct a "detached" instance instead of going through the
detach procedure, but the user's entity doesn't have a version field.  A
fix was submitted in January which basically changes our merge algorithm
to always issue a database query to see if an instance with the same pks
exists when no detached state is present.

This has a few side effects:  
1. This fix makes merging much more database-intensive, as we have to
issue queries for all instances without detached state.  Though it might
not matter to OpenJPA devs, JDO actually treats persisting and merging
the same, so the inefficiency is even more pronounced in JDO.
2. With this fix, the results of OpenJPAEntityManager.isDetached (and
the methods it relies on like PersistenceCapable.pcIsDetached and
Broker.isDetached) are no longer considered reliable -- the
VersionAttachStrategy doesn't trust these results, but does its own
lookup in the DB.
3. The fix changes the place that many exceptions will be thrown from
flush/commit to merge.  This is allowed by the spec and is arguably
better (exceptions are thrown earlier), but I just want to point out
that it is a change. 

I think we can fix 245 in a way that lessens the side effects.  A couple
options:

1. There is existing code in PersistenceCapable.pcIsDetached to consider
an instance detached if it has any auto-assigned pk fields that aren't
set to their default values.  Right now this code is only executed if
the user has configured things to never use a detached state field.  We
could easily change things, however, so that a non-default auto-assigned
pk field is treated like a non-default version field: it always signals
a detached instance, even when detached state is enabled but missing.
This fixes 245 (which happened to involve an entity with auto-assigned
pk fields) and removes all 3 side effects above.  

The downside is that if you have an entity without a version field and
without auto-assigned pk fields and you attempt to construct a new
"detached" instance and merge it instead of actually going through the
detach process, OpenJPA will (correctly, but not desirably) think the
instance is new rather than detached.  

This is a pretty esoteric use case.  But even in this case, users have
to option of turning off the detached state field via configuration, and
OpenJPA will fall back on doing a DB query for the pk fields to
determine whether these instances are detached vs. new.  So there is
still a way to handle this case, you just have to set a config property.

2. Another option would be to change VersionAttachStrategy to only do a
DB lookup for types that have no version field.  This would alleviate
the inefficiencies for users that have version fields.  Unfortunately,
it wouldn't change any of the side effects of the existing fix for any
other users.

========

I would vote for proposal #1 above.  It returns us to very efficient
merge behavior (and JDO persist behavior) while allowing most users
(those with either version fields or auto-assigned pk fields) to still
manually construct and merge "detached" instances.  The only users left
out of the fix are those who want to manually construct and merge
"detached" instances without version fields or auto-assigned pk fields.
Even these users can turn off detached state, though, to force us to do
DB lookups when determining whether an instance is detached vs. new.

Thoughts?


Notice:  This email message, together with any attachments, may contain information  of  BEA Systems,  Inc.,  its subsidiaries  and  affiliated entities,  that may be confidential,  proprietary,  copyrighted  and/or legally privileged, and is intended solely for the use of the individual or entity named in this message. If you are not the intended recipient, and have received this message in error, please immediately return this by email and then delete it.

Re: Merge strategies and OPENJPA-245

Posted by Michael Dick <mi...@gmail.com>.
Hi Abe,

I like this approach better - since we already have at least one customer
hitting the corner case. The caveats seem reasonable to me, but then again
I'm not working on Kodo ;-).

Does anyone else object, or have other concerns?

-Mike

On Fri, Apr 4, 2008 at 3:41 PM, Abe White <aw...@bea.com> wrote:

> Here's another option that might suit everyone:
>
> - We change the enhancement so that a non-default auto-assigned pk field
> makes an instance detached even in the absence of detached state, as I
> proposed.
> - If the entity doesn't use auto-assigned pk fields (and has no version
> field) and detached state is in use but not present,
> PersistenceCapable.pcIsDetached will now return null for JPA enhanced
> entities.  We already treat null as "I don't know", and it causes us to
> go to the DB to see if there is an existing record.  For JDO enhanced
> entities, we'll return false as before.  We can do this by returning the
> result of a new private pcIsDetachedStateDefinitive method we'll add,
> and the JDO enhancer will change the return value of this method from
> null to false.  This won't cause any compat problems with
> already-enhanced entities, because it won't be part of the public
> PersistenceCapable interface.
>
> This way the corner case you're concerned about is covered without
> having to change the default config, and meanwhile JDO users still get
> efficient persist behavior.
>
> The downsides are:
> - Merging new version-less and auto-assigned-pk-less entities (i.e. ones
> you really do want inserted, not updated, but are found during merge
> instead of passed to persist) still causes a DB lookup to make sure
> there is no existing entity with the same PKs.  I think this is so
> infrequent as to be perfectly OK.
> - If a user upgrades Kodo without reenhancing OPENJPA-245 won't be
> "fixed" for him.  Then when he reenhances the behavior will change and
> it will act fixed.  Kind of odd, but I can live with it.
>
>
> On Fri, 2008-04-04 at 14:26 -0500, Michael Dick wrote:
> > Hi Abe,
> >
> > I'm guilty of not updating 245 after I finished it. The scenario that
> got me
> > to look into it is the edge case you mentioned - the entity has no auto
> > generated fields.
> >
> > Improving the story for other scenarios sounds good to me and I like the
> > approach.
> >
> > My main concern is that we don't break the edge case I mentioned.
> Something
> > like this should still work :
> >
> > class Person {
> >     @Id
> >     int id;
> >
> >     String name;
> >
> >     // setters etc.
> > }
> >
> > Person p = new Person();
> > p.setId(1);
> > p.setName("Mike");
> > em.persist(p);
> > // commit tran
> >
> > // later
> > p = new Person();  // new entity - matches an existing row.
> > p2.setId(1);
> > p.setName("NotMike");
> > em.merge(p);
> > // commit tran
> >
> > Under proposal #1 I'd have to add something like this to persistence.xml
> > <property name="openjpa.DetachState"
> > value="fetch-groups(DetachedStateField=false)"/>
> >
> > With proposal #2 I wouldn't have to do anything, but it improves
> performance
> > for a smaller set of users.
> >
> > I'm in favor of the first approach. Many users who want this kind of
> > processing will set the openjpaDetachState property to change loaded to
> > fetch-groups or all (otherwise you lose the ability to set a field to
> null).
>
>
> Notice:  This email message, together with any attachments, may contain
> information  of  BEA Systems,  Inc.,  its subsidiaries  and  affiliated
> entities,  that may be confidential,  proprietary,  copyrighted  and/or
> legally privileged, and is intended solely for the use of the individual or
> entity named in this message. If you are not the intended recipient, and
> have received this message in error, please immediately return this by email
> and then delete it.
>

Re: Merge strategies and OPENJPA-245

Posted by Abe White <aw...@bea.com>.
Here's another option that might suit everyone:

- We change the enhancement so that a non-default auto-assigned pk field
makes an instance detached even in the absence of detached state, as I
proposed.
- If the entity doesn't use auto-assigned pk fields (and has no version
field) and detached state is in use but not present,
PersistenceCapable.pcIsDetached will now return null for JPA enhanced
entities.  We already treat null as "I don't know", and it causes us to
go to the DB to see if there is an existing record.  For JDO enhanced
entities, we'll return false as before.  We can do this by returning the
result of a new private pcIsDetachedStateDefinitive method we'll add,
and the JDO enhancer will change the return value of this method from
null to false.  This won't cause any compat problems with
already-enhanced entities, because it won't be part of the public
PersistenceCapable interface.

This way the corner case you're concerned about is covered without
having to change the default config, and meanwhile JDO users still get
efficient persist behavior.

The downsides are:
- Merging new version-less and auto-assigned-pk-less entities (i.e. ones
you really do want inserted, not updated, but are found during merge
instead of passed to persist) still causes a DB lookup to make sure
there is no existing entity with the same PKs.  I think this is so
infrequent as to be perfectly OK.
- If a user upgrades Kodo without reenhancing OPENJPA-245 won't be
"fixed" for him.  Then when he reenhances the behavior will change and
it will act fixed.  Kind of odd, but I can live with it. 


On Fri, 2008-04-04 at 14:26 -0500, Michael Dick wrote:
> Hi Abe,
> 
> I'm guilty of not updating 245 after I finished it. The scenario that got me
> to look into it is the edge case you mentioned - the entity has no auto
> generated fields.
> 
> Improving the story for other scenarios sounds good to me and I like the
> approach.
> 
> My main concern is that we don't break the edge case I mentioned. Something
> like this should still work :
> 
> class Person {
>     @Id
>     int id;
> 
>     String name;
> 
>     // setters etc.
> }
> 
> Person p = new Person();
> p.setId(1);
> p.setName("Mike");
> em.persist(p);
> // commit tran
> 
> // later
> p = new Person();  // new entity - matches an existing row.
> p2.setId(1);
> p.setName("NotMike");
> em.merge(p);
> // commit tran
> 
> Under proposal #1 I'd have to add something like this to persistence.xml
> <property name="openjpa.DetachState"
> value="fetch-groups(DetachedStateField=false)"/>
> 
> With proposal #2 I wouldn't have to do anything, but it improves performance
> for a smaller set of users.
> 
> I'm in favor of the first approach. Many users who want this kind of
> processing will set the openjpaDetachState property to change loaded to
> fetch-groups or all (otherwise you lose the ability to set a field to null).


Notice:  This email message, together with any attachments, may contain information  of  BEA Systems,  Inc.,  its subsidiaries  and  affiliated entities,  that may be confidential,  proprietary,  copyrighted  and/or legally privileged, and is intended solely for the use of the individual or entity named in this message. If you are not the intended recipient, and have received this message in error, please immediately return this by email and then delete it.

Re: Merge strategies and OPENJPA-245

Posted by Michael Dick <mi...@gmail.com>.
Hi Abe,

I'm guilty of not updating 245 after I finished it. The scenario that got me
to look into it is the edge case you mentioned - the entity has no auto
generated fields.

Improving the story for other scenarios sounds good to me and I like the
approach.

My main concern is that we don't break the edge case I mentioned. Something
like this should still work :

class Person {
    @Id
    int id;

    String name;

    // setters etc.
}

Person p = new Person();
p.setId(1);
p.setName("Mike");
em.persist(p);
// commit tran

// later
p = new Person();  // new entity - matches an existing row.
p2.setId(1);
p.setName("NotMike");
em.merge(p);
// commit tran

Under proposal #1 I'd have to add something like this to persistence.xml
<property name="openjpa.DetachState"
value="fetch-groups(DetachedStateField=false)"/>

With proposal #2 I wouldn't have to do anything, but it improves performance
for a smaller set of users.

I'm in favor of the first approach. Many users who want this kind of
processing will set the openjpaDetachState property to change loaded to
fetch-groups or all (otherwise you lose the ability to set a field to null).




On Fri, Apr 4, 2008 at 12:22 PM, Abe White <aw...@bea.com> wrote:

> OpenJPA typically uses an enhancer-added "detached state field" to
> differentiate newly-constructed instances that need to be inserted vs.
> detached instances that need to be updated on merge.  This is covered in
> more detail in the user manual.
>
> OpenJPA also allows users with version fields to manually construct a
> "detached" instance just by setting the pk fields to the pks of an
> existing instance and setting the version field appropriately.  A
> non-default version field value is considered a directive to treat an
> instance as existing on merge, even if the detached state is missing.
>
> The OPENJPA-245 JIRA issue describes a case in which a user wants to
> manually construct a "detached" instance instead of going through the
> detach procedure, but the user's entity doesn't have a version field.  A
> fix was submitted in January which basically changes our merge algorithm
> to always issue a database query to see if an instance with the same pks
> exists when no detached state is present.
>
> This has a few side effects:
> 1. This fix makes merging much more database-intensive, as we have to
> issue queries for all instances without detached state.  Though it might
> not matter to OpenJPA devs, JDO actually treats persisting and merging
> the same, so the inefficiency is even more pronounced in JDO.
> 2. With this fix, the results of OpenJPAEntityManager.isDetached (and
> the methods it relies on like PersistenceCapable.pcIsDetached and
> Broker.isDetached) are no longer considered reliable -- the
> VersionAttachStrategy doesn't trust these results, but does its own
> lookup in the DB.
> 3. The fix changes the place that many exceptions will be thrown from
> flush/commit to merge.  This is allowed by the spec and is arguably
> better (exceptions are thrown earlier), but I just want to point out
> that it is a change.
>
> I think we can fix 245 in a way that lessens the side effects.  A couple
> options:
>
> 1. There is existing code in PersistenceCapable.pcIsDetached to consider
> an instance detached if it has any auto-assigned pk fields that aren't
> set to their default values.  Right now this code is only executed if
> the user has configured things to never use a detached state field.  We
> could easily change things, however, so that a non-default auto-assigned
> pk field is treated like a non-default version field: it always signals
> a detached instance, even when detached state is enabled but missing.
> This fixes 245 (which happened to involve an entity with auto-assigned
> pk fields) and removes all 3 side effects above.
>
> The downside is that if you have an entity without a version field and
> without auto-assigned pk fields and you attempt to construct a new
> "detached" instance and merge it instead of actually going through the
> detach process, OpenJPA will (correctly, but not desirably) think the
> instance is new rather than detached.
>
> This is a pretty esoteric use case.  But even in this case, users have
> to option of turning off the detached state field via configuration, and
> OpenJPA will fall back on doing a DB query for the pk fields to
> determine whether these instances are detached vs. new.  So there is
> still a way to handle this case, you just have to set a config property.
>
> 2. Another option would be to change VersionAttachStrategy to only do a
> DB lookup for types that have no version field.  This would alleviate
> the inefficiencies for users that have version fields.  Unfortunately,
> it wouldn't change any of the side effects of the existing fix for any
> other users.
>
> ========
>
> I would vote for proposal #1 above.  It returns us to very efficient
> merge behavior (and JDO persist behavior) while allowing most users
> (those with either version fields or auto-assigned pk fields) to still
> manually construct and merge "detached" instances.  The only users left
> out of the fix are those who want to manually construct and merge
> "detached" instances without version fields or auto-assigned pk fields.
> Even these users can turn off detached state, though, to force us to do
> DB lookups when determining whether an instance is detached vs. new.
>
> Thoughts?
>
>
> Notice:  This email message, together with any attachments, may contain
> information  of  BEA Systems,  Inc.,  its subsidiaries  and  affiliated
> entities,  that may be confidential,  proprietary,  copyrighted  and/or
> legally privileged, and is intended solely for the use of the individual or
> entity named in this message. If you are not the intended recipient, and
> have received this message in error, please immediately return this by email
> and then delete it.
>