You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@empire-db.apache.org by Alain Becam <Al...@embl.de> on 2011/09/29 13:38:26 UTC

Unexpected behaviour while using notIn

Hello all,
     I have been trying to write a simple query with a notIn:


			
DBCommand cmdFindPubForPerson = dbPerson.createCommand();
			cmdFindPubForPerson.select(publication4Group.C_PUBLICATION_ID);		 
cmdFindPubForPerson.where(publication4Group.C_GROUP_ID.is(group_Id));
	
DBQuery queryForAssociatedPub = new DBQuery(cmdFindCollForPerson);
			
	
DBCommand cmdFindPub = dbPerson.createCommand();
	
cmdFindPub.selectDistinct();
cmdFindPub.select(publicationsFromDb.getColumns());	 
cmdFindPub.where(publicationsFromDb.C_ID.notIn(queryForAssociatedColl));

And was expecting this query:

SELECT DISTINCT t11.CREATEDBY, ...
FROM PUBLICATIONS t11
WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
FROM PUBLICATION4GROUP t13
WHERE t13.GROUP_ID=27))

but got that:

SELECT DISTINCT t11.CREATEDBY, ...
FROM PUBLICATIONS t11, PUBLICATION4GROUP t13
WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
FROM PUBLICATION4GROUP t13
WHERE t13.GROUP_ID=27))

which is wrong and pretty slow (well should be).

Is there a reason I get the PUBLICATION4GROUP table in the main query 
from? Might be a very stupid question, I wonder if I am not missing 
something pretty simple here :)

//Alain

Re: Unexpected behaviour while using notIn

Posted by Alain Becam <Al...@embl.de>.
Hi Rainer,
     Thank you for your quick answer. So there was indeed something 
simple I missed ;)

And it's not optimized too, I did a couple of "notIn" to quickly get 
some queries running, and it was not too slow for testing it. But your 
too solutions should work, including the one I really shouldn't have 
missed with the left join :)

Keep up the good work! I saved a lot of time using EmpireDB!

//Alain

On 29.09.2011 21:42, Rainer Döbele wrote:
> Hi Alain,
>
> the answer to your problem is simple:
> You just made it a little more complicated than it needs to be.
>
> Simply user your sub-query command directly with the notIn() expression like this:
>
> 	cmdFindPub.where(publicationsFromDb.C_ID.notIn( cmdFindPubForPerson ))
>
> the DBQuery object is not required here at all.
>
> However, you might want to consider a more effective statement by joining the two queries rather than using the query inside a constraint. This is where DBQuery object comes in.
> In that case you would replace
>
> 	cmdFindPub.where(publicationsFromDb.C_ID.notIn(...))
>
> by
>
> 	DBColumnExpr Q_PUB_ID = queryForAssociatedPub.findQueryColumn(publication4Group.C_PUBLICATION_ID);
> 	cmdFindPub.join(publicationsFromDb.C_ID, Q_PUB_ID, joinType.Left);
> 	cmdFindPub.where(Q_PUB_ID.is(null));
>
> This should give you something like this:
>
> 	SELECT DISTINCT t11.CREATEDBY, ...
> 	FROM PUBLICATIONS t11
> 	  LEFT JOIN (SELECT DISTINCT t13.PUBLICATION_ID
> 			 FROM PUBLICATION4GROUP t13
> 			 WHERE t13.GROUP_ID=27) q on q.PUBLICATION_ID=t11.ID
> 	WHERE q.PUBLICATION_ID is null
>
> ...which performs way better than your approach.
>
> Question is however, whether you need the subquery at all. Why not write:
>
> 	SELECT DISTINCT t11.CREATEDBY, ...
> 	FROM PUBLICATIONS t11
> 	  LEFT JOIN PUBLICATION4GROUP t13 on t13.PUBLICATION_ID=t11.ID and t13.GROUP_ID=27
> 	WHERE t13.PUBLICATION_ID is null
>
> (haven't really tried it though)
>
> The behavior you described is by design and is required for the second example to work.
>
> Regards
> Rainer
>
>
>> from: Alain Becam [mailto:Alain.Becam@embl.de]
>> to: empire-db-user@incubator.apache.org
>> re: Unexpected behaviour while using notIn
>>
>> Hello all,
>>       I have been trying to write a simple query with a notIn:
>>
>>
>>
>> DBCommand cmdFindPubForPerson = dbPerson.createCommand();
>>
>> 	cmdFindPubForPerson.select(publication4Group.C_PUBLICATION_ID);
>>
>> cmdFindPubForPerson.where(publication4Group.C_GROUP_ID.is(group_Id));
>>
>> DBQuery queryForAssociatedPub = new DBQuery(cmdFindCollForPerson);
>>
>>
>> DBCommand cmdFindPub = dbPerson.createCommand();
>>
>> cmdFindPub.selectDistinct();
>> cmdFindPub.select(publicationsFromDb.getColumns());
>> cmdFindPub.where(publicationsFromDb.C_ID.notIn(queryForAssociatedColl))
>> ;
>>
>> And was expecting this query:
>>
>> SELECT DISTINCT t11.CREATEDBY, ...
>> FROM PUBLICATIONS t11
>> WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
>> FROM PUBLICATION4GROUP t13
>> WHERE t13.GROUP_ID=27))
>>
>> but got that:
>>
>> SELECT DISTINCT t11.CREATEDBY, ...
>> FROM PUBLICATIONS t11, PUBLICATION4GROUP t13
>> WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
>> FROM PUBLICATION4GROUP t13
>> WHERE t13.GROUP_ID=27))
>>
>> which is wrong and pretty slow (well should be).
>>
>> Is there a reason I get the PUBLICATION4GROUP table in the main query
>> from? Might be a very stupid question, I wonder if I am not missing
>> something pretty simple here :)
>>
>> //Alain

-- 
----------------------------
       Alain Becam, PhD
IT Services, EMBL Heidelberg
  mailto:Alain.Becam@embl.de
  Tel +49 (0) 6221 387 8593
----------------------------

re: Unexpected behaviour while using notIn

Posted by Rainer Döbele <do...@esteam.de>.
Hi Alain,

the answer to your problem is simple:
You just made it a little more complicated than it needs to be.

Simply user your sub-query command directly with the notIn() expression like this:

	cmdFindPub.where(publicationsFromDb.C_ID.notIn( cmdFindPubForPerson ))

the DBQuery object is not required here at all.

However, you might want to consider a more effective statement by joining the two queries rather than using the query inside a constraint. This is where DBQuery object comes in.
In that case you would replace

	cmdFindPub.where(publicationsFromDb.C_ID.notIn(...))

by 

	DBColumnExpr Q_PUB_ID = queryForAssociatedPub.findQueryColumn(publication4Group.C_PUBLICATION_ID);
	cmdFindPub.join(publicationsFromDb.C_ID, Q_PUB_ID, joinType.Left);
	cmdFindPub.where(Q_PUB_ID.is(null));

This should give you something like this:

	SELECT DISTINCT t11.CREATEDBY, ...
	FROM PUBLICATIONS t11
	  LEFT JOIN (SELECT DISTINCT t13.PUBLICATION_ID
			 FROM PUBLICATION4GROUP t13
			 WHERE t13.GROUP_ID=27) q on q.PUBLICATION_ID=t11.ID
	WHERE q.PUBLICATION_ID is null

...which performs way better than your approach.

Question is however, whether you need the subquery at all. Why not write:

	SELECT DISTINCT t11.CREATEDBY, ...
	FROM PUBLICATIONS t11
	  LEFT JOIN PUBLICATION4GROUP t13 on t13.PUBLICATION_ID=t11.ID and t13.GROUP_ID=27
	WHERE t13.PUBLICATION_ID is null

(haven't really tried it though)

The behavior you described is by design and is required for the second example to work.

Regards
Rainer


> from: Alain Becam [mailto:Alain.Becam@embl.de]
> to: empire-db-user@incubator.apache.org
> re: Unexpected behaviour while using notIn
> 
> Hello all,
>      I have been trying to write a simple query with a notIn:
> 
> 
> 
> DBCommand cmdFindPubForPerson = dbPerson.createCommand();
> 
> 	cmdFindPubForPerson.select(publication4Group.C_PUBLICATION_ID);
> 
> cmdFindPubForPerson.where(publication4Group.C_GROUP_ID.is(group_Id));
> 
> DBQuery queryForAssociatedPub = new DBQuery(cmdFindCollForPerson);
> 
> 
> DBCommand cmdFindPub = dbPerson.createCommand();
> 
> cmdFindPub.selectDistinct();
> cmdFindPub.select(publicationsFromDb.getColumns());
> cmdFindPub.where(publicationsFromDb.C_ID.notIn(queryForAssociatedColl))
> ;
> 
> And was expecting this query:
> 
> SELECT DISTINCT t11.CREATEDBY, ...
> FROM PUBLICATIONS t11
> WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
> FROM PUBLICATION4GROUP t13
> WHERE t13.GROUP_ID=27))
> 
> but got that:
> 
> SELECT DISTINCT t11.CREATEDBY, ...
> FROM PUBLICATIONS t11, PUBLICATION4GROUP t13
> WHERE t11.ID NOT IN ((SELECT DISTINCT t13.PUBLICATION_ID
> FROM PUBLICATION4GROUP t13
> WHERE t13.GROUP_ID=27))
> 
> which is wrong and pretty slow (well should be).
> 
> Is there a reason I get the PUBLICATION4GROUP table in the main query
> from? Might be a very stupid question, I wonder if I am not missing
> something pretty simple here :)
> 
> //Alain