You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cayenne.apache.org by Hugi Thordarson <hu...@karlmenn.is> on 2011/10/25 12:54:47 UTC

Can I specify join type for relationships in the model?

Good morning all.

The subject says it all, really: Cayenne seems to default to inner joins for relationships. Is it possible for me to set the join type for relationships in the model, so I don't have to specify the join type for every query? (I always want outer joins)

Cheers,
- hugi

Re: Can I specify join type for relationships in the model?

Posted by Andrus Adamchik <an...@objectstyle.org>.
On Oct 27, 2011, at 7:03 PM, Durchholz, Joachim wrote:
>  I was assuming that your mention of "relationships is always done via an INNER join" includes references expressed as path expressions.


This was about getting a related object for a given object (for which Cayenne builds a special internal query), and not about SelectQuery qualifiers that by themselves don't retrieve the related objects. 

Andrus

RE: Can I specify join type for relationships in the model?

Posted by "Durchholz, Joachim" <Jo...@hennig-fahrzeugteile.de>.
> On Oct 27, 2011, at 1:02 PM, Durchholz, Joachim wrote:
> 
>> Hmm... that means when the parent exists but not the grandparent, accessing the grandparent does not get the parent into the object cache.
> 
> Not sure I understand what scenario you are talking about. o.getRelated().getRelated() would certainly get the intermediate object in the cache.

When accessing/filtering via path expressions.
(See http://cayenne.apache.org/doc/path-expressions.html )
I was assuming that your mention of "relationships is always done via an INNER join" includes references expressed as path expressions.

Regards,
Jo

Re: Can I specify join type for relationships in the model?

Posted by Andrus Adamchik <an...@objectstyle.org>.
On Oct 27, 2011, at 1:02 PM, Durchholz, Joachim wrote:

> Hmm... that means when the parent exists but not the grandparent, accessing the grandparent does not get the parent into the object cache.

Not sure I understand what scenario you are talking about. o.getRelated().getRelated() would certainly get the intermediate object in the cache.

Andrus


RE: Can I specify join type for relationships in the model?

Posted by "Durchholz, Joachim" <Jo...@hennig-fahrzeugteile.de>.
>> I had thought that OF COURSE you need to specify optionality on relationships, but it seems to be far less a case than I thought.
> 
> Actually if we talk about "optionality" flag, it is something different from the relationship join type per Hugi's original email. Not sure he was talking about the same thing.

You're right. I was just making the connection because that's the one situation where the difference between outer and inner joins makes sense at the relationship level.
For fetch optimization, specifying a join type does not make that much sense.
Hibernate does have it, on the grounds that the application programmer can specify the default strategy when there's no more specific reason to use the other fetch type. I'm not sure that that is a good idea though; flipping that switch might change the application's performance in unexpected ways.

>> For automatic SELECTs by Cayenne, I see one use case: if the 
>> application holds a record that's a grandchild, and you access the 
>> grandparent, Cayenne could optimize the outer join between parent and 
>> grandparent table into an inner join if it knew that the to-1 
>> relationship from parent to grandparent is mandatory. (Inner joins 
>> give the query planner of the database more options. More options 
>> means better plans, but maybe also more fruitless optimization 
>> planning, so it's a mixed blessing - also, grandchild-to-grandparent 
>> selects tend to not be bulk transfers, so this use case is a rather 
>> weak argument.)
> 
> Strictly reading the relationships is always done via an INNER join.

Hmm... that means when the parent exists but not the grandparent, accessing the grandparent does not get the parent into the object cache.
I'd have expected Cayenne to act as if it first loaded the parent, then the grandparent, so the parent would be in memory anyway.
I'm not sure how that difference would be actually noticeable. And it can always be fixed by a per-query option, so it's not a too big deal anyway.

>> Then there's 1:1 relationships. If one side is optional, then it's 
>> just a case of 1:N with the additional constraint N<=1. If both
>> sides are mandatory, we're in trouble - Cayenne does not know in
>> which order to do the updates (and most databases will complain,
>> since they do their consistency checks immediately instead of at
>> end-of-transaction, so you can't INSERT into the left table because
>> there is no record in the right table yet, and vice versa).
> 
> we do have a checkbox "To Dep PK" in the join that hints Cayenne which 1:1 side is "primary" and which is "dependent". So this situation is handled correctly.

I wouldn't have made that connection if you hadn't mentioned that. "To Dep PK" says "related to primary keys, and directions", but I wouldn't know what it actually does. Nor would I know to look there if I needed to establish a dependency direction.

> But yeah 1:1, and more generally PK:PK relationships, even if this is a relationship between parts of a compound PK, are a case when we don't know the optionality, and have to assume the worst case. E.g. lazy faulting uses optionality information to decide whether to return a HOLLOW object from unresolved relationship without a query, or whether the fetch is required to determine if the target is not null. Somewhat of an edge case (you are reading the relationship, but don't care to resolve the resulting HOLLOW object... but may be good for NULL checking or indirect access to an FK via related object PK).

Resolving that is actually relatively easy - retrieve a PK field via an outer join, and assume the target isn't there if that field is NULL.
Well, for databases that actually allow NULLs in PKs, you'd probably want to use a SELECT COUNT(*) or IF EXISTS subquery instead.
Anyway, what I mean to say is that you don't need to actually resolve the object to know whether it exists (and if the related table has non-PK fields, this check will be faster than simply retrieving the object).

> We don't do much (anything at all) in the area of analyzing native exceptions. So I can't comment on specifics, but it would be cool to have such ability.

I'm not sure that it's worth the effort. You'd be hunting a moving target, since exception behaviour of databases can (and sometime does) change depending on database release, mode of operation, isolation level, and probably phase of moon.

Regards,
Jo

Re: Can I specify join type for relationships in the model?

Posted by Andrus Adamchik <an...@objectstyle.org>.
On Oct 26, 2011, at 1:48 PM, Durchholz, Joachim wrote:

> I had thought that OF COURSE you need to specify optionality on relationships, but it seems to be far less a case than I thought.


Actually if we talk about "optionality" flag, it is something different from the relationship join type per Hugi's original email. Not sure he was talking about the same thing.

Cayenne can actually guess optionality of some relationships (e.g. to-one based on non-NULL FK). Those that can't be guessed, are assumed to be optional. And it does use the optionality information in many places (fault resolution strategies come to mind, see below). So the question here whether we should bother with an extra mapping property for those few remaining cases (that are handled correctly, but perhaps can be optimized a bit further).

> For automatic SELECTs by Cayenne, I see one use case: if the application holds a record that's a grandchild, and you access the grandparent, Cayenne could optimize the outer join between parent and grandparent table into an inner join if it knew that the to-1 relationship from parent to grandparent is mandatory. (Inner joins give the query planner of the database more options. More options means better plans, but maybe also more fruitless optimization planning, so it's a mixed blessing - also, grandchild-to-grandparent selects tend to not be bulk transfers, so this use case is a rather weak argument.)

Strictly reading the relationships is always done via an INNER join. But *prefetching* them in a query that selects relationship root is using OUTER. So this may be about optimizing prefetching (in Cayenne terms anyways).

> Then there's 1:1 relationships. If one side is optional, then it's just a case of 1:N with the additional constraint N<=1. If both sides are mandatory, we're in trouble - Cayenne does not know in which order to do the updates (and most databases will complain, since they do their consistency checks immediately instead of at end-of-transaction,
> so you can't INSERT into the left table because there is no record in the right table yet, and vice versa).

we do have a checkbox "To Dep PK" in the join that hints Cayenne which 1:1 side is "primary" and which is "dependent". So this situation is handled correctly.

But yeah 1:1, and more generally PK:PK relationships, even if this is a relationship between parts of a compound PK, are a case when we don't know the optionality, and have to assume the worst case. E.g. lazy faulting uses optionality information to decide whether to return a HOLLOW object from unresolved relationship without a query, or whether the fetch is required to determine if the target is not null. Somewhat of an edge case (you are reading the relationship, but don't care to resolve the resulting HOLLOW object... but may be good for NULL checking or indirect access to an FK via related object PK).

> The other point for optionality information is that Cayenne could do consistency checks before writing to the database.
> (I don't know whether Cayenne does.)
> If Cayenne knows, it can avoid getting hit by SQLExceptions from constraint violations. This could help Cayenne if it does exception analysis, such as would be needed to activate the various workarounds needed as Oracle reports rollback segment problems.
> On the con side, for the application, it doesn't matter much to the application whether it gets hit by an SQLException or a Cayenne-generated consistency exception.

We don't do much (anything at all) in the area of analyzing native exceptions. So I can't comment on specifics, but it would be cool to have such ability.

Cheers,
Andrus


RE: Can I specify join type for relationships in the model?

Posted by "Durchholz, Joachim" <Jo...@hennig-fahrzeugteile.de>.
> I don't think we had a discussion. We can have it now.

Nice - the "that's by design" bit made me fear it's not negotiable anymore.

> In this particular case my thinking is that relationship
> itself does not have any particular "join semantics"...
> It is always a match between two keys. If you just need
> to find objects related to a given object, using outer
> join never makes sense.

Hmm... that's interesting.

I had thought that OF COURSE you need to specify optionality on relationships, but it seems to be far less a case than I thought.

Let's enumerate use cases where the optionality of a relationship's side might be relevant.

For automatic SELECTs by Cayenne, I see one use case: if the application holds a record that's a grandchild, and you access the grandparent, Cayenne could optimize the outer join between parent and grandparent table into an inner join if it knew that the to-1 relationship from parent to grandparent is mandatory. (Inner joins give the query planner of the database more options. More options means better plans, but maybe also more fruitless optimization planning, so it's a mixed blessing - also, grandchild-to-grandparent selects tend to not be bulk transfers, so this use case is a rather weak argument.)

What's the case for write access? I.e. INSERT/UPDATE/DELETE?

Here, optionality information is very relevant for determining action order: you delete the optional record before you delete the nonoptional one, and you create the nonoptional record before creating the optional one.
However, for a 1:N relationship, the N side is always optional, so there is a valid ordering available. Telling Cayenne that the 1 side is optional as well would give it more freedom to reorder writes, but it already has a valid order and doesn't need an additional one.
Then there's 1:1 relationships. If one side is optional, then it's just a case of 1:N with the additional constraint N<=1. If both sides are mandatory, we're in trouble - Cayenne does not know in which order to do the updates (and most databases will complain, since they do their consistency checks immediately instead of at end-of-transaction, so you can't INSERT into the left table because there is no record in the right table yet, and vice versa).

The other point for optionality information is that Cayenne could do consistency checks before writing to the database.
(I don't know whether Cayenne does.)
If Cayenne knows, it can avoid getting hit by SQLExceptions from constraint violations. This could help Cayenne if it does exception analysis, such as would be needed to activate the various workarounds needed as Oracle reports rollback segment problems.
On the con side, for the application, it doesn't matter much to the application whether it gets hit by an SQLException or a Cayenne-generated consistency exception.

> Where it does make sense is in
> qualifiers. But this makes it a property of a qualifier,
> not the relationship.

It makes it a property of the qualifier, but per se, that does not preclude it as a property of the relationship.

> In generally it is often hard to find such single place
> for many ORM properties. There's always a concern that
> anything beyond basic DB mapping is really a property of
> the execution context (e.g. it is different per-application,
> per-session, per-query, etc. - a good example is entity
> callbacks/listeners adding behavior to the mapping). So
> this often becomes a question of whether "one size fits all".

I fully agree with that point of view.

Regards,
Jo

Re: Can I specify join type for relationships in the model?

Posted by Daniel Scheibe <ds...@googlemail.com>.
Hi Hugi,

not sure this is related to your issue but i really like the way Cayenne 
gives me control of the join type inside a path expression as mentioned 
here:

http://cayenne.apache.org/doc/path-expressions.html

Of course this is not handled on the model level tho.

Cheers,
Daniel

Am 25.10.2011 14:03, schrieb Andrus Adamchik:
> I don't think we had a discussion. We can have it now.
>
> In this particular case my thinking is that relationship itself does not have any particular "join semantics"... It is always a match between two keys. If you just need to find objects related to a given object, using outer join never makes sense. Where it does make sense is in qualifiers. But this makes it a property of a qualifier, not the relationship.
>
> In generally it is often hard to find such single place for many ORM properties. There's always a concern that anything beyond basic DB mapping is really a property of the execution context (e.g. it is different per-application, per-session, per-query, etc. - a good example is entity callbacks/listeners adding behavior to the mapping). So this often becomes a question of whether "one size fits all".
>
> Andrus
>
>
> On Oct 25, 2011, at 2:40 PM, Hugi Thordarson wrote:
>> Thank you Andrus.
>> Has there ever been discussion of adding the ability to specify join types when modeling relationships, or would that be at odds with design intentions? It seems like a useful feature.
>>
>> Cheers,
>> - hugi
>>
>>
>>
>> On 25.10.2011, at 11:23, Andrus Adamchik wrote:
>>
>>> Yes, in Cayenne join type is a property of a path in expression or query, not a relationship. This is by design.
>>>
>>> Andrus
>>>
>>> On Oct 25, 2011, at 1:54 PM, Hugi Thordarson wrote:
>>>
>>>> Good morning all.
>>>>
>>>> The subject says it all, really: Cayenne seems to default to inner joins for relationships. Is it possible for me to set the join type for relationships in the model, so I don't have to specify the join type for every query? (I always want outer joins)
>>>>
>>>> Cheers,
>>>> - hugi
>>>
>>
>>
>

Re: Can I specify join type for relationships in the model?

Posted by Andrus Adamchik <an...@objectstyle.org>.
I don't think we had a discussion. We can have it now.

In this particular case my thinking is that relationship itself does not have any particular "join semantics"... It is always a match between two keys. If you just need to find objects related to a given object, using outer join never makes sense. Where it does make sense is in qualifiers. But this makes it a property of a qualifier, not the relationship.

In generally it is often hard to find such single place for many ORM properties. There's always a concern that anything beyond basic DB mapping is really a property of the execution context (e.g. it is different per-application, per-session, per-query, etc. - a good example is entity callbacks/listeners adding behavior to the mapping). So this often becomes a question of whether "one size fits all".

Andrus


On Oct 25, 2011, at 2:40 PM, Hugi Thordarson wrote:
> Thank you Andrus.
> Has there ever been discussion of adding the ability to specify join types when modeling relationships, or would that be at odds with design intentions? It seems like a useful feature.
> 
> Cheers,
> - hugi
> 
> 
> 
> On 25.10.2011, at 11:23, Andrus Adamchik wrote:
> 
>> Yes, in Cayenne join type is a property of a path in expression or query, not a relationship. This is by design.
>> 
>> Andrus
>> 
>> On Oct 25, 2011, at 1:54 PM, Hugi Thordarson wrote:
>> 
>>> Good morning all.
>>> 
>>> The subject says it all, really: Cayenne seems to default to inner joins for relationships. Is it possible for me to set the join type for relationships in the model, so I don't have to specify the join type for every query? (I always want outer joins)
>>> 
>>> Cheers,
>>> - hugi
>> 
> 
> 


Re: Can I specify join type for relationships in the model?

Posted by Hugi Thordarson <hu...@karlmenn.is>.
Thank you Andrus.
Has there ever been discussion of adding the ability to specify join types when modeling relationships, or would that be at odds with design intentions? It seems like a useful feature.

Cheers,
- hugi



On 25.10.2011, at 11:23, Andrus Adamchik wrote:

> Yes, in Cayenne join type is a property of a path in expression or query, not a relationship. This is by design.
> 
> Andrus
> 
> On Oct 25, 2011, at 1:54 PM, Hugi Thordarson wrote:
> 
>> Good morning all.
>> 
>> The subject says it all, really: Cayenne seems to default to inner joins for relationships. Is it possible for me to set the join type for relationships in the model, so I don't have to specify the join type for every query? (I always want outer joins)
>> 
>> Cheers,
>> - hugi
> 


Re: Can I specify join type for relationships in the model?

Posted by Andrus Adamchik <an...@objectstyle.org>.
Yes, in Cayenne join type is a property of a path in expression or query, not a relationship. This is by design.

Andrus

On Oct 25, 2011, at 1:54 PM, Hugi Thordarson wrote:

> Good morning all.
> 
> The subject says it all, really: Cayenne seems to default to inner joins for relationships. Is it possible for me to set the join type for relationships in the model, so I don't have to specify the join type for every query? (I always want outer joins)
> 
> Cheers,
> - hugi