You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@empire-db.apache.org by Alain Becam <Al...@embl.de> on 2011/09/29 13:38:26 UTC
Unexpected behaviour while using notIn
Hello all,
I have been trying to write a simple query with a notIn:
DBCommand cmdFindPubForPerson = dbPerson.createCommand();
cmdFindPubForPerson.select(publication4Group.C_PUBLICATION_ID);
cmdFindPubForPerson.where(publication4Group.C_GROUP_ID.is(group_Id));
DBQuery queryForAssociatedPub = new DBQuery(cmdFindCollForPerson);
DBCommand cmdFindPub = dbPerson.createCommand();
cmdFindPub.selectDistinct();
cmdFindPub.select(publicationsFromDb.getColumns());
cmdFindPub.where(publicationsFromDb.C_ID.notIn(queryForAssociatedColl));
And was expecting this query:
SELECT DISTINCT t11.CREATEDBY, ...
FROM PUBLICATIONS t11
WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
FROM PUBLICATION4GROUP t13
WHERE t13.GROUP_ID=27))
but got that:
SELECT DISTINCT t11.CREATEDBY, ...
FROM PUBLICATIONS t11, PUBLICATION4GROUP t13
WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
FROM PUBLICATION4GROUP t13
WHERE t13.GROUP_ID=27))
which is wrong and pretty slow (well should be).
Is there a reason I get the PUBLICATION4GROUP table in the main query
from? Might be a very stupid question, I wonder if I am not missing
something pretty simple here :)
//Alain
Re: Unexpected behaviour while using notIn
Posted by Alain Becam <Al...@embl.de>.
Hi Rainer,
Thank you for your quick answer. So there was indeed something
simple I missed ;)
And it's not optimized too, I did a couple of "notIn" to quickly get
some queries running, and it was not too slow for testing it. But your
too solutions should work, including the one I really shouldn't have
missed with the left join :)
Keep up the good work! I saved a lot of time using EmpireDB!
//Alain
On 29.09.2011 21:42, Rainer Döbele wrote:
> Hi Alain,
>
> the answer to your problem is simple:
> You just made it a little more complicated than it needs to be.
>
> Simply user your sub-query command directly with the notIn() expression like this:
>
> cmdFindPub.where(publicationsFromDb.C_ID.notIn( cmdFindPubForPerson ))
>
> the DBQuery object is not required here at all.
>
> However, you might want to consider a more effective statement by joining the two queries rather than using the query inside a constraint. This is where DBQuery object comes in.
> In that case you would replace
>
> cmdFindPub.where(publicationsFromDb.C_ID.notIn(...))
>
> by
>
> DBColumnExpr Q_PUB_ID = queryForAssociatedPub.findQueryColumn(publication4Group.C_PUBLICATION_ID);
> cmdFindPub.join(publicationsFromDb.C_ID, Q_PUB_ID, joinType.Left);
> cmdFindPub.where(Q_PUB_ID.is(null));
>
> This should give you something like this:
>
> SELECT DISTINCT t11.CREATEDBY, ...
> FROM PUBLICATIONS t11
> LEFT JOIN (SELECT DISTINCT t13.PUBLICATION_ID
> FROM PUBLICATION4GROUP t13
> WHERE t13.GROUP_ID=27) q on q.PUBLICATION_ID=t11.ID
> WHERE q.PUBLICATION_ID is null
>
> ...which performs way better than your approach.
>
> Question is however, whether you need the subquery at all. Why not write:
>
> SELECT DISTINCT t11.CREATEDBY, ...
> FROM PUBLICATIONS t11
> LEFT JOIN PUBLICATION4GROUP t13 on t13.PUBLICATION_ID=t11.ID and t13.GROUP_ID=27
> WHERE t13.PUBLICATION_ID is null
>
> (haven't really tried it though)
>
> The behavior you described is by design and is required for the second example to work.
>
> Regards
> Rainer
>
>
>> from: Alain Becam [mailto:Alain.Becam@embl.de]
>> to: empire-db-user@incubator.apache.org
>> re: Unexpected behaviour while using notIn
>>
>> Hello all,
>> I have been trying to write a simple query with a notIn:
>>
>>
>>
>> DBCommand cmdFindPubForPerson = dbPerson.createCommand();
>>
>> cmdFindPubForPerson.select(publication4Group.C_PUBLICATION_ID);
>>
>> cmdFindPubForPerson.where(publication4Group.C_GROUP_ID.is(group_Id));
>>
>> DBQuery queryForAssociatedPub = new DBQuery(cmdFindCollForPerson);
>>
>>
>> DBCommand cmdFindPub = dbPerson.createCommand();
>>
>> cmdFindPub.selectDistinct();
>> cmdFindPub.select(publicationsFromDb.getColumns());
>> cmdFindPub.where(publicationsFromDb.C_ID.notIn(queryForAssociatedColl))
>> ;
>>
>> And was expecting this query:
>>
>> SELECT DISTINCT t11.CREATEDBY, ...
>> FROM PUBLICATIONS t11
>> WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
>> FROM PUBLICATION4GROUP t13
>> WHERE t13.GROUP_ID=27))
>>
>> but got that:
>>
>> SELECT DISTINCT t11.CREATEDBY, ...
>> FROM PUBLICATIONS t11, PUBLICATION4GROUP t13
>> WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
>> FROM PUBLICATION4GROUP t13
>> WHERE t13.GROUP_ID=27))
>>
>> which is wrong and pretty slow (well should be).
>>
>> Is there a reason I get the PUBLICATION4GROUP table in the main query
>> from? Might be a very stupid question, I wonder if I am not missing
>> something pretty simple here :)
>>
>> //Alain
--
----------------------------
Alain Becam, PhD
IT Services, EMBL Heidelberg
mailto:Alain.Becam@embl.de
Tel +49 (0) 6221 387 8593
----------------------------
re: Unexpected behaviour while using notIn
Posted by Rainer Döbele <do...@esteam.de>.
Hi Alain,
the answer to your problem is simple:
You just made it a little more complicated than it needs to be.
Simply user your sub-query command directly with the notIn() expression like this:
cmdFindPub.where(publicationsFromDb.C_ID.notIn( cmdFindPubForPerson ))
the DBQuery object is not required here at all.
However, you might want to consider a more effective statement by joining the two queries rather than using the query inside a constraint. This is where DBQuery object comes in.
In that case you would replace
cmdFindPub.where(publicationsFromDb.C_ID.notIn(...))
by
DBColumnExpr Q_PUB_ID = queryForAssociatedPub.findQueryColumn(publication4Group.C_PUBLICATION_ID);
cmdFindPub.join(publicationsFromDb.C_ID, Q_PUB_ID, joinType.Left);
cmdFindPub.where(Q_PUB_ID.is(null));
This should give you something like this:
SELECT DISTINCT t11.CREATEDBY, ...
FROM PUBLICATIONS t11
LEFT JOIN (SELECT DISTINCT t13.PUBLICATION_ID
FROM PUBLICATION4GROUP t13
WHERE t13.GROUP_ID=27) q on q.PUBLICATION_ID=t11.ID
WHERE q.PUBLICATION_ID is null
...which performs way better than your approach.
Question is however, whether you need the subquery at all. Why not write:
SELECT DISTINCT t11.CREATEDBY, ...
FROM PUBLICATIONS t11
LEFT JOIN PUBLICATION4GROUP t13 on t13.PUBLICATION_ID=t11.ID and t13.GROUP_ID=27
WHERE t13.PUBLICATION_ID is null
(haven't really tried it though)
The behavior you described is by design and is required for the second example to work.
Regards
Rainer
> from: Alain Becam [mailto:Alain.Becam@embl.de]
> to: empire-db-user@incubator.apache.org
> re: Unexpected behaviour while using notIn
>
> Hello all,
> I have been trying to write a simple query with a notIn:
>
>
>
> DBCommand cmdFindPubForPerson = dbPerson.createCommand();
>
> cmdFindPubForPerson.select(publication4Group.C_PUBLICATION_ID);
>
> cmdFindPubForPerson.where(publication4Group.C_GROUP_ID.is(group_Id));
>
> DBQuery queryForAssociatedPub = new DBQuery(cmdFindCollForPerson);
>
>
> DBCommand cmdFindPub = dbPerson.createCommand();
>
> cmdFindPub.selectDistinct();
> cmdFindPub.select(publicationsFromDb.getColumns());
> cmdFindPub.where(publicationsFromDb.C_ID.notIn(queryForAssociatedColl))
> ;
>
> And was expecting this query:
>
> SELECT DISTINCT t11.CREATEDBY, ...
> FROM PUBLICATIONS t11
> WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
> FROM PUBLICATION4GROUP t13
> WHERE t13.GROUP_ID=27))
>
> but got that:
>
> SELECT DISTINCT t11.CREATEDBY, ...
> FROM PUBLICATIONS t11, PUBLICATION4GROUP t13
> WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
> FROM PUBLICATION4GROUP t13
> WHERE t13.GROUP_ID=27))
>
> which is wrong and pretty slow (well should be).
>
> Is there a reason I get the PUBLICATION4GROUP table in the main query
> from? Might be a very stupid question, I wonder if I am not missing
> something pretty simple here :)
>
> //Alain