You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@openjpa.apache.org by "Landers, Richard" <Ri...@ct.gov> on 2009/12/14 19:17:14 UTC

EntityManager fetch and query by key behave differently?

Hello everyone,
 
Can anyone clarify my understanding:
 
I have an entity ("Organization") that I often fetch by it's key (a
String).
 
In DB2 (v9.5 community) the ID field is a CHAR(10).
 
I've been using EntityManger.find(Class, String) to do the job for
months.
 
I was prettying-up URLs in my web layer when I noticed this behavior:
 
(1) When I pass in: "ORG-10    " (note the trailing spaces) I get an
entity with "ORG-10    " (with trailing spaces) as it's ID.
 
(2) When I pass in "ORG-10" (trimmed) I get an entity with "ORG-10"
(trimmed) as it's ID.
 
Behavior #2 surprised me! I expected "ORG-10    ", with trailing spaces,
just as it is in the database.
 
As an experiment, I changed my service implementation to use a query on
the key instead of find() and the behavior goes away.  That is, I get
"ORG-10    " as the ID of the Organization regardless of whether the
input was trimmed or not.
 
Am I misunderstanding how it should work?
 
  --Rich
 

Re: EntityManager fetch and query by key behave differently?

Posted by Michael Dick <mi...@gmail.com>.
Hi all,

It seems to me that there are really two scenarios here.

1.) The results from executing a query, or find should contain _the_ values
as returned by JDBC. So for DB2, if a column is returned as 'test     ' we
should store that value in the entity. Ie :
EntityManager em = emf.createEntityManager();
MyEntity e = em.find(MyEntity.class, 'test     ') ;
e.getId().equals('test      ')   // the same would apply to a nonID field.

2.) When an entity is already in the persistence context we should use the
values from in memory (at least until em.refresh is called). Ie.
EntityManager em = emf.createEntityManager();
em.persist(new MyEntity('test');  // no whitespace

MyEntity e = em.find(MyEntity.class, 'test     ') ;  // we know the database
will trim trailing whitespace.
e.getId().equals('test')   // 'test' is still in memory, and we must return
a single instance for each row.

2.b)
em.refresh(e);
e.getId().equals('test     ')  // since we've been told to update with the
latest values from the database this would be a time to add trailing
whitespace.

Of these, it's #2 which concerns me the most. Having two instances which
correspond to the same row can lead to interesting problems (that and I
think I know how to fix it).

I've opened JIRA issue OPENJPA-1501 for #2 in particular, but I'll take a
look at the other aspects too..

Like Milosz mentioned the behavior might be different across databases. In
some 'raw' JDBC tests I noticed that DB2 and Oracle would include the
trailing spaces, but MySQL truncated them. Some further investigation is
probably needed, but ultimately I think we'd want to rely on the JDBC driver
returning the 'right' value..

-mike

On Thu, Jan 28, 2010 at 12:10 PM, Miłosz Tylenda <mt...@o2.pl> wrote:

> Hi All,
>
> That interesting mismatch reminds me of a few side-notes.
>
> 1. I remember Oracle treating "test2    " as equal to "test2" in WHERE
> clause when using constants. In other words, if a table has a row with value
> "test2    " the following query will return it:
>
> SELECT * FROM table WHERE column='test2';
>
> On the other hand, if the query is written using ?-parameter
>
> SELECT * FROM table WHERE column=?;
>
> you have to right-pad the parameter ("test2    "). Otherwise, no match will
> be found. It was at Oracle 8/9 times, I don't know if anything changed
> recently.
>
> 2. I am wondering whether the JDBC spec defines the behaviour of
> ?-parameters in our context.
>
> 3. OpenJPA uses ?-parameters in this case, even if the JPQL query uses
> constants.
>
> 4. I can imagine a similar problem can arise if one tries to use
> approximate types as primary keys, e.g. float. However, this
> approximate-problem is clearer since the JPA spec explicitly discourages
> such usages.
>
> Regards,
> Milosz
>
>
> > Heath,
> >
> > Thanks for confirming the behavior, I was beginning to think I was
> > imagining things :^)
> >
> > From my limited experience, here's my two cents: make find() and JPQL
> > work as much like SQL as possible (for this case).
> >
> > To me this means:
> >
> > 1. Always return a padded string when loading an attribute mapped to a
> > fixed-width column from the database [1].
> >
> > 2. A query result should never return the query parameter used, only the
> > actual value of the column.
> >
> > In SQL, if you query:
> >
> >         SELECT e.ent1pk FROM Ent1 e WHERE e.ent1pk = 'test2'
> >
> > you expect to get 'test2    ' if that is what is stored in the ent1pk
> > column.  Yes, the COMPARISON of 'test2' and 'test2   ' is a little
> > complicated, as you describe.  However, fundamentally, 'test2' is NOT
> > STORED IN THE DATABASE.  So why would JPA return it to me in ANY query
> > result?
> >
> > If we were to follow this approach, then all entities Ent1 would have
> > padded primary keys ('test2    ').  The existing default equals()
> > implementation then would work as desired.  And, because of that, I
> > assume the JPA machinery would no longer load two entities for one
> > database key.
> >
> > I realize that this approach only treats the query problem.  You also
> > describe a problem upon entity creation -- where you create an Ent1 with
> > ent1pk='test2' and later query it other ways and find it's not equal to
> > itself. You wrote:
> >
> >       // The type of the Primary Key column here is CHAR(10)
> >       Ent1 ent1 = new Ent1("test2");
> >       em.persist(ent1);
> >       em.flush();
> >       em.refresh(ent1);
> >
> > Extending the above "like SQL" rule, one COULD expect ent1's primary key
> > value to change from 'test2' to 'test2      ', perhaps after persist or
> > upon refresh.  In SQL inserting 'test2' into a fixed-width column will
> > pad the value [1].
> >
> > I see this as an unavoidable artifact of the SQL layer.  So, I'm willing
> > to put up with this one.  JPA, by it's nature as a mapping layer, is
> > bound to be a slightly leaky abstraction.  The only alternative I can
> > see is to throw an exception if the ent1pk field is not the correct
> > (database) length.  I'm not sure that's better though.
> >
> >   --Rich
> >
> > [1] My knowledge of SQL is limited to practical experience, primarily
> > with DB2.  I'm not sure if these behaviors are actually in the SQL
> > specification or not.
> >
> >
> >
> > -----Original Message-----
> > From: Heath Thomann [mailto:jpaheath@gmail.com]
> > Sent: Tuesday, January 26, 2010 12:47 PM
> > To: users@openjpa.apache.org
> > Subject: Re: EntityManager fetch and query by key behave differently?
> >
> > Richard,
> >     Ironically, I'm working a customer problem where similar behavior is
> > exhibited.  I have findings similar to yours, and then some.  I'd like
> > to detail my findings, but ultimately we have a couple bugs with the use
> > of Strings as primary keys when mapped to CHARs or VARCHARs.  I will
> > focus only on CHARs but I have noticed some similar results with
> > VARCHARs.  To set the stage, lets take this entity:
> >
> > @Entity
> > public class Ent1 {
> >     @Id
> >     @Column(columnDefinition="CHAR(10)")
> >     private String ent1pk;
> >    ..........
> >     public String toString() {
> >         return this.getClass().getName() + "[PK: " +
> > this.getPk().toString()
> > + "]";
> >     }
> >
> >
> > With this entity, lets perform the following tests:
> >
> >
> > EntityManager em;
> > em.getTransaction().begin();
> >
> > // The type of the Primary Key column here is CHAR(10)
> > Ent1 ent1 = new Ent1("test2");
> > em.persist(ent1);
> > em.flush();
> > em.refresh(ent1);
> >
> > // case a: perform a query using a key padded with spaces...note the
> > trailing spaces in //the Primary key in the second query:
> > Ent1 ent1FromQuery_noPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e
> > WHERE e.ent1pk = 'test2'").getSingleResult();
> > Ent1 ent1FromQuery_wPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e
> > WHERE
> > e.ent1pk = 'test2        '").getSingleResult();
> >
> > System.out.println("ent1 = " + ent1);
> > System.out.println("ent1FromQuery_noPads = " + ent1FromQuery_noPads);
> > System.out.println("ent1 == ent1FromQuery_noPads : " + (ent1 ==
> > ent1FromQuery_noPads));
> > System.out.println("ent1.equals(ent1FromQuery_noPads) : " +
> > (ent1.equals(ent1FromQuery_noPads)));
> > System.out.println("ent1FromQuery_wPads = " + ent1FromQuery_wPads);
> > System.out.println("ent1 == ent1FromQuery_wPads : " + (ent1 ==
> > ent1FromQuery_wPads));
> > System.out.println("ent1.equals(ent1FromQuery_wPads) : " +
> > (ent1.equals(ent1FromQuery_wPads)));
> >
> >
> > // case b: perform a find using a key padded with spaces...note the
> > trailing spaces in //the Primary key in the second find:
> > Ent1 ent1FromFind_noPads = em.find(Ent1.class, "test2");
> > Ent1 ent1FromFind_wPads = em.find(Ent1.class, "test2        ");
> >
> > System.out.println("ent1FromFind_noPads = " + ent1FromFind_noPads);
> > System.out.println("ent1 == ent1FromFind_noPads : " + (ent1 ==
> > ent1FromFind_noPads));
> > System.out.println("ent1.equals(ent1FromFind_noPads) : " +
> > (ent1.equals(ent1FromFind_noPads)));
> > System.out.println("ent1FromFind_wPads = " + ent1FromFind_wPads);
> > System.out.println("ent1 == ent1FromFind_wPads : " + (ent1 ==
> > ent1FromFind_wPads));
> > System.out.println("ent1.equals(ent1FromFind_wPads) : " +
> > (ent1.equals(ent1FromFind_wPads)));
> >
> >
> > With this test, let me show the output (println's) and then we can
> > discuss what is wrong with the output and why.....the output is as
> > follows:
> >
> >
> > 1)  ent1 = Ent1[PK: test2]
> > 2)  ent1FromQuery_noPads = Ent1[PK: test2     ]
> > 3)  ent1 == ent1FromQuery_noPads : false
> > 4)  ent1.equals(ent1FromQuery_noPads) : false
> > 5)  ent1FromQuery_wPads = Ent1[PK: test2     ]
> > 6)  ent1 == ent1FromQuery_wPads : false
> > 7)  ent1.equals(ent1FromQuery_wPads) : false
> >
> > 8)  ent1FromFind_noPads = Ent1[PK: test2]
> > 9)  ent1 == ent1FromFind_noPads : true
> > 10) ent1.equals(ent1FromFind_noPads) : true
> > 11) ent1FromFind_wPads = Ent1[PK: test2        ]
> > 12) ent1 == ent1FromFind_wPads : false
> > 13) ent1.equals(ent1FromFind_wPads) : false
> >
> >
> > A study of this output shows a couple problems:
> > 1) Entity Ent1 was created with a PK of "test2" which contains no
> > pads/trailing spaces.  However, as we can see in line 2 and 5, the PK of
> > the entities returned from a 'find', regardless of whether we use a
> > 'padded' key or not, yields an entity with a padded PK.  That is, the PK
> > contains 5 pads, for a total of 10 chars (i.e. the column length).
> > Lines 3, 4, 6, and 7 yield incorrect results given the PK is padded.
> > Essentially, after the query we have two entities which correspond to
> > the same row (which is not correct per JPA spec).
> > 2) When we do a 'find', rather than a 'query', we get different results.
> > When we do a query using a PK with no pads, we get back an Entity which
> > contains a PK with no pads.  When we do a query with a PK which has been
> > padded with 8 pads/spaces, the entity we get back contains a PK with 8
> > pads as well.  In this case, line 9 and 10 shows correct results, but
> > line 12 and
> > 13 shows incorrect results.
> >
> > In speaking to my colleague Mike Dick, he found the following in the SQL
> > spec which would indicate that we have a bug, and why (paragraph 3.b
> > being the pertinent info):
> >
> > 3) The comparison of two character strings is determined as follows:
> > a) Let CS be the collation as determined by Subclause 9.13, "Collation
> > determination", for the declared types of the two character strings.
> > b) If the length in characters of X is not equal to the length in
> > characters of Y, then the shorter string is effectively replaced, for
> > the purposes of comparison, with a copy of itself that has been extended
> > to the length of the longer string by concatenation on the right of one
> > or more pad characters, where the pad character is chosen based on CS.
> > If CS has the NO PAD characteristic, then the pad character is an
> > implementation-dependent character different from any character in the
> > character set of X and Y that collates less than any string under CS.
> > Otherwise, the pad character is a <space>.
> > c) The result of the comparison of X and Y is given by the collation CS.
> > d) Depending on the collation, two strings may compare as equal even if
> > they are of different lengths or contain different sequences of
> > characters. When any of the operations MAX, MIN, and DISTINCT reference
> > a grouping column, and the UNION, EXCEPT, and INTERSECT operators refer
> > to character strings, the specific value selected by these operations
> > from a set of such equal values is implementationdependent.
> >
> >
> > This indicates that the database should treat 'test' == 'test ' ==
> > 'test             '.   By extension we'd assume that JPA should also
> > treat
> > them as equal.
> >
> > I'd love to hear from the communtiy on this matter to determine what the
> > correct behaviour should be.
> >
> > Thanks,
> >
> > Heath
> >
> >
> > On Mon, Dec 14, 2009 at 11:17 AM, Landers, Richard
> > <Ri...@ct.gov>wrote:
> >
> > > Hello everyone,
> > >
> > > Can anyone clarify my understanding:
> > >
> > > I have an entity ("Organization") that I often fetch by it's key (a
> > > String).
> > >
> > > In DB2 (v9.5 community) the ID field is a CHAR(10).
> > >
> > > I've been using EntityManger.find(Class, String) to do the job for
> > > months.
> > >
> > > I was prettying-up URLs in my web layer when I noticed this behavior:
> > >
> > > (1) When I pass in: "ORG-10    " (note the trailing spaces) I get an
> > > entity with "ORG-10    " (with trailing spaces) as it's ID.
> > >
> > > (2) When I pass in "ORG-10" (trimmed) I get an entity with "ORG-10"
> > > (trimmed) as it's ID.
> > >
> > > Behavior #2 surprised me! I expected "ORG-10    ", with trailing
> > spaces,
> > > just as it is in the database.
> > >
> > > As an experiment, I changed my service implementation to use a query
> > > on the key instead of find() and the behavior goes away.  That is, I
> > get
> > > "ORG-10    " as the ID of the Organization regardless of whether the
> > > input was trimmed or not.
> > >
> > > Am I misunderstanding how it should work?
> > >
> > >  --Rich
> > >
> > >
> >
> >
> >
> >
> >
>

RE: EntityManager fetch and query by key behave differently?

Posted by Miłosz Tylenda <mt...@o2.pl>.
Hi All,

That interesting mismatch reminds me of a few side-notes.

1. I remember Oracle treating "test2    " as equal to "test2" in WHERE clause when using constants. In other words, if a table has a row with value "test2    " the following query will return it:

SELECT * FROM table WHERE column='test2';

On the other hand, if the query is written using ?-parameter

SELECT * FROM table WHERE column=?;

you have to right-pad the parameter ("test2    "). Otherwise, no match will be found. It was at Oracle 8/9 times, I don't know if anything changed recently.

2. I am wondering whether the JDBC spec defines the behaviour of ?-parameters in our context.

3. OpenJPA uses ?-parameters in this case, even if the JPQL query uses constants.

4. I can imagine a similar problem can arise if one tries to use approximate types as primary keys, e.g. float. However, this approximate-problem is clearer since the JPA spec explicitly discourages such usages.

Regards,
Milosz


> Heath,
> 
> Thanks for confirming the behavior, I was beginning to think I was
> imagining things :^)
> 
> From my limited experience, here's my two cents: make find() and JPQL
> work as much like SQL as possible (for this case).
> 
> To me this means:
> 
> 1. Always return a padded string when loading an attribute mapped to a
> fixed-width column from the database [1].
> 
> 2. A query result should never return the query parameter used, only the
> actual value of the column.
> 
> In SQL, if you query:
> 
>         SELECT e.ent1pk FROM Ent1 e WHERE e.ent1pk = 'test2'
> 
> you expect to get 'test2    ' if that is what is stored in the ent1pk
> column.  Yes, the COMPARISON of 'test2' and 'test2   ' is a little
> complicated, as you describe.  However, fundamentally, 'test2' is NOT
> STORED IN THE DATABASE.  So why would JPA return it to me in ANY query
> result?
> 
> If we were to follow this approach, then all entities Ent1 would have
> padded primary keys ('test2    ').  The existing default equals()
> implementation then would work as desired.  And, because of that, I
> assume the JPA machinery would no longer load two entities for one
> database key.
> 
> I realize that this approach only treats the query problem.  You also
> describe a problem upon entity creation -- where you create an Ent1 with
> ent1pk='test2' and later query it other ways and find it's not equal to
> itself. You wrote:
> 
> 	// The type of the Primary Key column here is CHAR(10)
> 	Ent1 ent1 = new Ent1("test2");
> 	em.persist(ent1);
> 	em.flush();
> 	em.refresh(ent1);
> 
> Extending the above "like SQL" rule, one COULD expect ent1's primary key
> value to change from 'test2' to 'test2      ', perhaps after persist or
> upon refresh.  In SQL inserting 'test2' into a fixed-width column will
> pad the value [1].
> 
> I see this as an unavoidable artifact of the SQL layer.  So, I'm willing
> to put up with this one.  JPA, by it's nature as a mapping layer, is
> bound to be a slightly leaky abstraction.  The only alternative I can
> see is to throw an exception if the ent1pk field is not the correct
> (database) length.  I'm not sure that's better though.
> 
>   --Rich
> 
> [1] My knowledge of SQL is limited to practical experience, primarily
> with DB2.  I'm not sure if these behaviors are actually in the SQL
> specification or not.
> 
> 
> 
> -----Original Message-----
> From: Heath Thomann [mailto:jpaheath@gmail.com]
> Sent: Tuesday, January 26, 2010 12:47 PM
> To: users@openjpa.apache.org
> Subject: Re: EntityManager fetch and query by key behave differently?
> 
> Richard,
>     Ironically, I'm working a customer problem where similar behavior is
> exhibited.  I have findings similar to yours, and then some.  I'd like
> to detail my findings, but ultimately we have a couple bugs with the use
> of Strings as primary keys when mapped to CHARs or VARCHARs.  I will
> focus only on CHARs but I have noticed some similar results with
> VARCHARs.  To set the stage, lets take this entity:
> 
> @Entity
> public class Ent1 {
>     @Id
>     @Column(columnDefinition="CHAR(10)")
>     private String ent1pk;
>    ..........
>     public String toString() {
>         return this.getClass().getName() + "[PK: " +
> this.getPk().toString()
> + "]";
>     }
> 
> 
> With this entity, lets perform the following tests:
> 
> 
> EntityManager em;
> em.getTransaction().begin();
> 
> // The type of the Primary Key column here is CHAR(10)
> Ent1 ent1 = new Ent1("test2");
> em.persist(ent1);
> em.flush();
> em.refresh(ent1);
> 
> // case a: perform a query using a key padded with spaces...note the
> trailing spaces in //the Primary key in the second query:
> Ent1 ent1FromQuery_noPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e
> WHERE e.ent1pk = 'test2'").getSingleResult();
> Ent1 ent1FromQuery_wPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e
> WHERE
> e.ent1pk = 'test2        '").getSingleResult();
> 
> System.out.println("ent1 = " + ent1);
> System.out.println("ent1FromQuery_noPads = " + ent1FromQuery_noPads);
> System.out.println("ent1 == ent1FromQuery_noPads : " + (ent1 ==
> ent1FromQuery_noPads));
> System.out.println("ent1.equals(ent1FromQuery_noPads) : " +
> (ent1.equals(ent1FromQuery_noPads)));
> System.out.println("ent1FromQuery_wPads = " + ent1FromQuery_wPads);
> System.out.println("ent1 == ent1FromQuery_wPads : " + (ent1 ==
> ent1FromQuery_wPads));
> System.out.println("ent1.equals(ent1FromQuery_wPads) : " +
> (ent1.equals(ent1FromQuery_wPads)));
> 
> 
> // case b: perform a find using a key padded with spaces...note the
> trailing spaces in //the Primary key in the second find:
> Ent1 ent1FromFind_noPads = em.find(Ent1.class, "test2");
> Ent1 ent1FromFind_wPads = em.find(Ent1.class, "test2        ");
> 
> System.out.println("ent1FromFind_noPads = " + ent1FromFind_noPads);
> System.out.println("ent1 == ent1FromFind_noPads : " + (ent1 ==
> ent1FromFind_noPads));
> System.out.println("ent1.equals(ent1FromFind_noPads) : " +
> (ent1.equals(ent1FromFind_noPads)));
> System.out.println("ent1FromFind_wPads = " + ent1FromFind_wPads);
> System.out.println("ent1 == ent1FromFind_wPads : " + (ent1 ==
> ent1FromFind_wPads));
> System.out.println("ent1.equals(ent1FromFind_wPads) : " +
> (ent1.equals(ent1FromFind_wPads)));
> 
> 
> With this test, let me show the output (println's) and then we can
> discuss what is wrong with the output and why.....the output is as
> follows:
> 
> 
> 1)  ent1 = Ent1[PK: test2]
> 2)  ent1FromQuery_noPads = Ent1[PK: test2     ]
> 3)  ent1 == ent1FromQuery_noPads : false
> 4)  ent1.equals(ent1FromQuery_noPads) : false
> 5)  ent1FromQuery_wPads = Ent1[PK: test2     ]
> 6)  ent1 == ent1FromQuery_wPads : false
> 7)  ent1.equals(ent1FromQuery_wPads) : false
> 
> 8)  ent1FromFind_noPads = Ent1[PK: test2]
> 9)  ent1 == ent1FromFind_noPads : true
> 10) ent1.equals(ent1FromFind_noPads) : true
> 11) ent1FromFind_wPads = Ent1[PK: test2        ]
> 12) ent1 == ent1FromFind_wPads : false
> 13) ent1.equals(ent1FromFind_wPads) : false
> 
> 
> A study of this output shows a couple problems:
> 1) Entity Ent1 was created with a PK of "test2" which contains no
> pads/trailing spaces.  However, as we can see in line 2 and 5, the PK of
> the entities returned from a 'find', regardless of whether we use a
> 'padded' key or not, yields an entity with a padded PK.  That is, the PK
> contains 5 pads, for a total of 10 chars (i.e. the column length).
> Lines 3, 4, 6, and 7 yield incorrect results given the PK is padded.
> Essentially, after the query we have two entities which correspond to
> the same row (which is not correct per JPA spec).
> 2) When we do a 'find', rather than a 'query', we get different results.
> When we do a query using a PK with no pads, we get back an Entity which
> contains a PK with no pads.  When we do a query with a PK which has been
> padded with 8 pads/spaces, the entity we get back contains a PK with 8
> pads as well.  In this case, line 9 and 10 shows correct results, but
> line 12 and
> 13 shows incorrect results.
> 
> In speaking to my colleague Mike Dick, he found the following in the SQL
> spec which would indicate that we have a bug, and why (paragraph 3.b
> being the pertinent info):
> 
> 3) The comparison of two character strings is determined as follows:
> a) Let CS be the collation as determined by Subclause 9.13, "Collation
> determination", for the declared types of the two character strings.
> b) If the length in characters of X is not equal to the length in
> characters of Y, then the shorter string is effectively replaced, for
> the purposes of comparison, with a copy of itself that has been extended
> to the length of the longer string by concatenation on the right of one
> or more pad characters, where the pad character is chosen based on CS.
> If CS has the NO PAD characteristic, then the pad character is an
> implementation-dependent character different from any character in the
> character set of X and Y that collates less than any string under CS.
> Otherwise, the pad character is a <space>.
> c) The result of the comparison of X and Y is given by the collation CS.
> d) Depending on the collation, two strings may compare as equal even if
> they are of different lengths or contain different sequences of
> characters. When any of the operations MAX, MIN, and DISTINCT reference
> a grouping column, and the UNION, EXCEPT, and INTERSECT operators refer
> to character strings, the specific value selected by these operations
> from a set of such equal values is implementationdependent.
> 
> 
> This indicates that the database should treat 'test' == 'test ' ==
> 'test             '.   By extension we'd assume that JPA should also
> treat
> them as equal.
> 
> I'd love to hear from the communtiy on this matter to determine what the
> correct behaviour should be.
> 
> Thanks,
> 
> Heath
> 
> 
> On Mon, Dec 14, 2009 at 11:17 AM, Landers, Richard
> <Ri...@ct.gov>wrote:
> 
> > Hello everyone,
> >
> > Can anyone clarify my understanding:
> >
> > I have an entity ("Organization") that I often fetch by it's key (a
> > String).
> >
> > In DB2 (v9.5 community) the ID field is a CHAR(10).
> >
> > I've been using EntityManger.find(Class, String) to do the job for
> > months.
> >
> > I was prettying-up URLs in my web layer when I noticed this behavior:
> >
> > (1) When I pass in: "ORG-10    " (note the trailing spaces) I get an
> > entity with "ORG-10    " (with trailing spaces) as it's ID.
> >
> > (2) When I pass in "ORG-10" (trimmed) I get an entity with "ORG-10"
> > (trimmed) as it's ID.
> >
> > Behavior #2 surprised me! I expected "ORG-10    ", with trailing
> spaces,
> > just as it is in the database.
> >
> > As an experiment, I changed my service implementation to use a query
> > on the key instead of find() and the behavior goes away.  That is, I
> get
> > "ORG-10    " as the ID of the Organization regardless of whether the
> > input was trimmed or not.
> >
> > Am I misunderstanding how it should work?
> >
> >  --Rich
> >
> >
> 
> 
> 
> 
> 

RE: EntityManager fetch and query by key behave differently?

Posted by "Landers, Richard" <Ri...@ct.gov>.
Heath,

Thanks for confirming the behavior, I was beginning to think I was
imagining things :^)

>From my limited experience, here's my two cents: make find() and JPQL
work as much like SQL as possible (for this case).

To me this means:

1. Always return a padded string when loading an attribute mapped to a
fixed-width column from the database [1].

2. A query result should never return the query parameter used, only the
actual value of the column.

In SQL, if you query:

        SELECT e.ent1pk FROM Ent1 e WHERE e.ent1pk = 'test2'

you expect to get 'test2    ' if that is what is stored in the ent1pk
column.  Yes, the COMPARISON of 'test2' and 'test2   ' is a little
complicated, as you describe.  However, fundamentally, 'test2' is NOT
STORED IN THE DATABASE.  So why would JPA return it to me in ANY query
result?

If we were to follow this approach, then all entities Ent1 would have
padded primary keys ('test2    ').  The existing default equals()
implementation then would work as desired.  And, because of that, I
assume the JPA machinery would no longer load two entities for one
database key.

I realize that this approach only treats the query problem.  You also
describe a problem upon entity creation -- where you create an Ent1 with
ent1pk='test2' and later query it other ways and find it's not equal to
itself. You wrote:

	// The type of the Primary Key column here is CHAR(10)
	Ent1 ent1 = new Ent1("test2");
	em.persist(ent1);
	em.flush();
	em.refresh(ent1);

Extending the above "like SQL" rule, one COULD expect ent1's primary key
value to change from 'test2' to 'test2      ', perhaps after persist or
upon refresh.  In SQL inserting 'test2' into a fixed-width column will
pad the value [1].

I see this as an unavoidable artifact of the SQL layer.  So, I'm willing
to put up with this one.  JPA, by it's nature as a mapping layer, is
bound to be a slightly leaky abstraction.  The only alternative I can
see is to throw an exception if the ent1pk field is not the correct
(database) length.  I'm not sure that's better though.

  --Rich

[1] My knowledge of SQL is limited to practical experience, primarily
with DB2.  I'm not sure if these behaviors are actually in the SQL
specification or not.



-----Original Message-----
From: Heath Thomann [mailto:jpaheath@gmail.com]
Sent: Tuesday, January 26, 2010 12:47 PM
To: users@openjpa.apache.org
Subject: Re: EntityManager fetch and query by key behave differently?

Richard,
    Ironically, I'm working a customer problem where similar behavior is
exhibited.  I have findings similar to yours, and then some.  I'd like
to detail my findings, but ultimately we have a couple bugs with the use
of Strings as primary keys when mapped to CHARs or VARCHARs.  I will
focus only on CHARs but I have noticed some similar results with
VARCHARs.  To set the stage, lets take this entity:

@Entity
public class Ent1 {
    @Id
    @Column(columnDefinition="CHAR(10)")
    private String ent1pk;
   ..........
    public String toString() {
        return this.getClass().getName() + "[PK: " +
this.getPk().toString()
+ "]";
    }


With this entity, lets perform the following tests:


EntityManager em;
em.getTransaction().begin();

// The type of the Primary Key column here is CHAR(10)
Ent1 ent1 = new Ent1("test2");
em.persist(ent1);
em.flush();
em.refresh(ent1);

// case a: perform a query using a key padded with spaces...note the
trailing spaces in //the Primary key in the second query:
Ent1 ent1FromQuery_noPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e
WHERE e.ent1pk = 'test2'").getSingleResult();
Ent1 ent1FromQuery_wPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e
WHERE
e.ent1pk = 'test2        '").getSingleResult();

System.out.println("ent1 = " + ent1);
System.out.println("ent1FromQuery_noPads = " + ent1FromQuery_noPads);
System.out.println("ent1 == ent1FromQuery_noPads : " + (ent1 ==
ent1FromQuery_noPads));
System.out.println("ent1.equals(ent1FromQuery_noPads) : " +
(ent1.equals(ent1FromQuery_noPads)));
System.out.println("ent1FromQuery_wPads = " + ent1FromQuery_wPads);
System.out.println("ent1 == ent1FromQuery_wPads : " + (ent1 ==
ent1FromQuery_wPads));
System.out.println("ent1.equals(ent1FromQuery_wPads) : " +
(ent1.equals(ent1FromQuery_wPads)));


// case b: perform a find using a key padded with spaces...note the
trailing spaces in //the Primary key in the second find:
Ent1 ent1FromFind_noPads = em.find(Ent1.class, "test2");
Ent1 ent1FromFind_wPads = em.find(Ent1.class, "test2        ");

System.out.println("ent1FromFind_noPads = " + ent1FromFind_noPads);
System.out.println("ent1 == ent1FromFind_noPads : " + (ent1 ==
ent1FromFind_noPads));
System.out.println("ent1.equals(ent1FromFind_noPads) : " +
(ent1.equals(ent1FromFind_noPads)));
System.out.println("ent1FromFind_wPads = " + ent1FromFind_wPads);
System.out.println("ent1 == ent1FromFind_wPads : " + (ent1 ==
ent1FromFind_wPads));
System.out.println("ent1.equals(ent1FromFind_wPads) : " +
(ent1.equals(ent1FromFind_wPads)));


With this test, let me show the output (println's) and then we can
discuss what is wrong with the output and why.....the output is as
follows:


1)  ent1 = Ent1[PK: test2]
2)  ent1FromQuery_noPads = Ent1[PK: test2     ]
3)  ent1 == ent1FromQuery_noPads : false
4)  ent1.equals(ent1FromQuery_noPads) : false
5)  ent1FromQuery_wPads = Ent1[PK: test2     ]
6)  ent1 == ent1FromQuery_wPads : false
7)  ent1.equals(ent1FromQuery_wPads) : false

8)  ent1FromFind_noPads = Ent1[PK: test2]
9)  ent1 == ent1FromFind_noPads : true
10) ent1.equals(ent1FromFind_noPads) : true
11) ent1FromFind_wPads = Ent1[PK: test2        ]
12) ent1 == ent1FromFind_wPads : false
13) ent1.equals(ent1FromFind_wPads) : false


A study of this output shows a couple problems:
1) Entity Ent1 was created with a PK of "test2" which contains no
pads/trailing spaces.  However, as we can see in line 2 and 5, the PK of
the entities returned from a 'find', regardless of whether we use a
'padded' key or not, yields an entity with a padded PK.  That is, the PK
contains 5 pads, for a total of 10 chars (i.e. the column length).
Lines 3, 4, 6, and 7 yield incorrect results given the PK is padded.
Essentially, after the query we have two entities which correspond to
the same row (which is not correct per JPA spec).
2) When we do a 'find', rather than a 'query', we get different results.
When we do a query using a PK with no pads, we get back an Entity which
contains a PK with no pads.  When we do a query with a PK which has been
padded with 8 pads/spaces, the entity we get back contains a PK with 8
pads as well.  In this case, line 9 and 10 shows correct results, but
line 12 and
13 shows incorrect results.

In speaking to my colleague Mike Dick, he found the following in the SQL
spec which would indicate that we have a bug, and why (paragraph 3.b
being the pertinent info):

3) The comparison of two character strings is determined as follows:
a) Let CS be the collation as determined by Subclause 9.13, "Collation
determination", for the declared types of the two character strings.
b) If the length in characters of X is not equal to the length in
characters of Y, then the shorter string is effectively replaced, for
the purposes of comparison, with a copy of itself that has been extended
to the length of the longer string by concatenation on the right of one
or more pad characters, where the pad character is chosen based on CS.
If CS has the NO PAD characteristic, then the pad character is an
implementation-dependent character different from any character in the
character set of X and Y that collates less than any string under CS.
Otherwise, the pad character is a <space>.
c) The result of the comparison of X and Y is given by the collation CS.
d) Depending on the collation, two strings may compare as equal even if
they are of different lengths or contain different sequences of
characters. When any of the operations MAX, MIN, and DISTINCT reference
a grouping column, and the UNION, EXCEPT, and INTERSECT operators refer
to character strings, the specific value selected by these operations
from a set of such equal values is implementationdependent.


This indicates that the database should treat 'test' == 'test ' ==
'test             '.   By extension we'd assume that JPA should also
treat
them as equal.

I'd love to hear from the communtiy on this matter to determine what the
correct behaviour should be.

Thanks,

Heath


On Mon, Dec 14, 2009 at 11:17 AM, Landers, Richard
<Ri...@ct.gov>wrote:

> Hello everyone,
>
> Can anyone clarify my understanding:
>
> I have an entity ("Organization") that I often fetch by it's key (a
> String).
>
> In DB2 (v9.5 community) the ID field is a CHAR(10).
>
> I've been using EntityManger.find(Class, String) to do the job for
> months.
>
> I was prettying-up URLs in my web layer when I noticed this behavior:
>
> (1) When I pass in: "ORG-10    " (note the trailing spaces) I get an
> entity with "ORG-10    " (with trailing spaces) as it's ID.
>
> (2) When I pass in "ORG-10" (trimmed) I get an entity with "ORG-10"
> (trimmed) as it's ID.
>
> Behavior #2 surprised me! I expected "ORG-10    ", with trailing
spaces,
> just as it is in the database.
>
> As an experiment, I changed my service implementation to use a query
> on the key instead of find() and the behavior goes away.  That is, I
get
> "ORG-10    " as the ID of the Organization regardless of whether the
> input was trimmed or not.
>
> Am I misunderstanding how it should work?
>
>  --Rich
>
>





Re: EntityManager fetch and query by key behave differently?

Posted by Heath Thomann <jp...@gmail.com>.
Richard,
    Ironically, I'm working a customer problem where similar behavior is
exhibited.  I have findings similar to yours, and then some.  I'd like to
detail my findings, but ultimately we have a couple bugs with the use of
Strings as primary keys when mapped to CHARs or VARCHARs.  I will focus only
on CHARs but I have noticed some similar results with VARCHARs.  To set the
stage, lets take this entity:

@Entity
public class Ent1 {
    @Id
    @Column(columnDefinition="CHAR(10)")
    private String ent1pk;
   ..........
    public String toString() {
        return this.getClass().getName() + "[PK: " + this.getPk().toString()
+ "]";
    }


With this entity, lets perform the following tests:


EntityManager em;
em.getTransaction().begin();

// The type of the Primary Key column here is CHAR(10)
Ent1 ent1 = new Ent1("test2");
em.persist(ent1);
em.flush();
em.refresh(ent1);

// case a: perform a query using a key padded with spaces...note the
trailing spaces in
//the Primary key in the second query:
Ent1 ent1FromQuery_noPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e
WHERE e.ent1pk = 'test2'").getSingleResult();
Ent1 ent1FromQuery_wPads = (Ent1) em.createQuery("SELECT e FROM Ent1 e WHERE
e.ent1pk = 'test2        '").getSingleResult();

System.out.println("ent1 = " + ent1);
System.out.println("ent1FromQuery_noPads = " + ent1FromQuery_noPads);
System.out.println("ent1 == ent1FromQuery_noPads : " + (ent1 ==
ent1FromQuery_noPads));
System.out.println("ent1.equals(ent1FromQuery_noPads) : " +
(ent1.equals(ent1FromQuery_noPads)));
System.out.println("ent1FromQuery_wPads = " + ent1FromQuery_wPads);
System.out.println("ent1 == ent1FromQuery_wPads : " + (ent1 ==
ent1FromQuery_wPads));
System.out.println("ent1.equals(ent1FromQuery_wPads) : " +
(ent1.equals(ent1FromQuery_wPads)));


// case b: perform a find using a key padded with spaces...note the trailing
spaces in
//the Primary key in the second find:
Ent1 ent1FromFind_noPads = em.find(Ent1.class, "test2");
Ent1 ent1FromFind_wPads = em.find(Ent1.class, "test2        ");

System.out.println("ent1FromFind_noPads = " + ent1FromFind_noPads);
System.out.println("ent1 == ent1FromFind_noPads : " + (ent1 ==
ent1FromFind_noPads));
System.out.println("ent1.equals(ent1FromFind_noPads) : " +
(ent1.equals(ent1FromFind_noPads)));
System.out.println("ent1FromFind_wPads = " + ent1FromFind_wPads);
System.out.println("ent1 == ent1FromFind_wPads : " + (ent1 ==
ent1FromFind_wPads));
System.out.println("ent1.equals(ent1FromFind_wPads) : " +
(ent1.equals(ent1FromFind_wPads)));


With this test, let me show the output (println's) and then we can discuss
what is wrong with the output and why.....the output is as follows:


1)  ent1 = Ent1[PK: test2]
2)  ent1FromQuery_noPads = Ent1[PK: test2     ]
3)  ent1 == ent1FromQuery_noPads : false
4)  ent1.equals(ent1FromQuery_noPads) : false
5)  ent1FromQuery_wPads = Ent1[PK: test2     ]
6)  ent1 == ent1FromQuery_wPads : false
7)  ent1.equals(ent1FromQuery_wPads) : false

8)  ent1FromFind_noPads = Ent1[PK: test2]
9)  ent1 == ent1FromFind_noPads : true
10) ent1.equals(ent1FromFind_noPads) : true
11) ent1FromFind_wPads = Ent1[PK: test2        ]
12) ent1 == ent1FromFind_wPads : false
13) ent1.equals(ent1FromFind_wPads) : false


A study of this output shows a couple problems:
1) Entity Ent1 was created with a PK of "test2" which contains no
pads/trailing spaces.  However, as we can see in line 2 and 5, the PK of the
entities returned from a 'find', regardless of whether we use a 'padded' key
or not, yields an entity with a padded PK.  That is, the PK contains 5 pads,
for a total of 10 chars (i.e. the column length).  Lines 3, 4, 6, and 7
yield incorrect results given the PK is padded.  Essentially, after the
query we have two entities which correspond to the same row (which is not
correct per JPA spec).
2) When we do a 'find', rather than a 'query', we get different results.
When we do a query using a PK with no pads, we get back an Entity which
contains a PK with no pads.  When we do a query with a PK which has been
padded with 8 pads/spaces, the entity we get back contains a PK with 8 pads
as well.  In this case, line 9 and 10 shows correct results, but line 12 and
13 shows incorrect results.

In speaking to my colleague Mike Dick, he found the following in the SQL
spec which would indicate that we have a bug, and why
(paragraph 3.b being the pertinent info):

3) The comparison of two character strings is determined as follows:
a) Let CS be the collation as determined by Subclause 9.13, “Collation
determination”, for the declared
types of the two character strings.
b) If the length in characters of X is not equal to the length in characters
of Y, then the shorter string is
effectively replaced, for the purposes of comparison, with a copy of itself
that has been extended to
the length of the longer string by concatenation on the right of one or more
pad characters, where the
pad character is chosen based on CS. If CS has the NO PAD characteristic,
then the pad character is
an implementation-dependent character different from any character in the
character set of X and Y
that collates less than any string under CS. Otherwise, the pad character is
a <space>.
c) The result of the comparison of X and Y is given by the collation CS.
d) Depending on the collation, two strings may compare as equal even if they
are of different lengths or
contain different sequences of characters. When any of the operations MAX,
MIN, and DISTINCT
reference a grouping column, and the UNION, EXCEPT, and INTERSECT operators
refer to character
strings, the specific value selected by these operations from a set of such
equal values is implementationdependent.


This indicates that the database should treat 'test' == 'test ' ==
'test             '.   By extension we'd assume that JPA should also treat
them as equal.

I'd love to hear from the communtiy on this matter to determine what the
correct behaviour should be.

Thanks,

Heath


On Mon, Dec 14, 2009 at 11:17 AM, Landers, Richard
<Ri...@ct.gov>wrote:

> Hello everyone,
>
> Can anyone clarify my understanding:
>
> I have an entity ("Organization") that I often fetch by it's key (a
> String).
>
> In DB2 (v9.5 community) the ID field is a CHAR(10).
>
> I've been using EntityManger.find(Class, String) to do the job for
> months.
>
> I was prettying-up URLs in my web layer when I noticed this behavior:
>
> (1) When I pass in: "ORG-10    " (note the trailing spaces) I get an
> entity with "ORG-10    " (with trailing spaces) as it's ID.
>
> (2) When I pass in "ORG-10" (trimmed) I get an entity with "ORG-10"
> (trimmed) as it's ID.
>
> Behavior #2 surprised me! I expected "ORG-10    ", with trailing spaces,
> just as it is in the database.
>
> As an experiment, I changed my service implementation to use a query on
> the key instead of find() and the behavior goes away.  That is, I get
> "ORG-10    " as the ID of the Organization regardless of whether the
> input was trimmed or not.
>
> Am I misunderstanding how it should work?
>
>  --Rich
>
>