You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Jeff Lichtman (JIRA)" <de...@db.apache.org> on 2006/01/04 06:14:01 UTC

[jira] Commented: (DERBY-781) Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times.

    [ http://issues.apache.org/jira/browse/DERBY-781?page=comments#action_12361709 ] 

Jeff Lichtman commented on DERBY-781:
-------------------------------------

I've been thinking about this enhancement request ever since it was reported. Something didn't seem quite right to me, but it wasn't until now that I was able to put my finger on it.

The report identifies a real problem (performance with a union as the inner table of a join) and proposes a solution that would work (materialization). I think, though, that the proposed solution focuses in the wrong place. The materialization should happen as a result of a join strategy, not as part of the logic associated with unions.

There are cases where materializing a union would cause a query to run slower. Materialization requires the creation of a temporary conglomerate and the inserting of rows into the conglomerate, so it should be done only if the savings are greater than the costs. Since materialization can make things either faster or slower depending on circumstances, the decision as to whether to materialize should be done in the optimizer.

Also, there are other types of  result sets that could benefit from materialization - for example, INTERSECT, joins, aggregates, etc. Any of these could end up on the right side of a join through the use of  table subqueries (i.e. SELECT statements in the FROM list of the outer query). I don't think we want to re-implement the materialization logic in all of these cases. I suppose the logic could be pushed into a parent class, but I think even that would be putting it in the wrong place.

If you think about it, we already have a join strategy that materializes the inner result  set, i.e. hash join.  I would expect the optimizer to at least consider this strategy for the example given in this enhancement request. We should check whether the optimizer is making the correct decision about hash join in this case before implementing materialization logic specific to unions.


> Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times.
> -------------------------------------------------------------------------------------------------------------
>
>          Key: DERBY-781
>          URL: http://issues.apache.org/jira/browse/DERBY-781
>      Project: Derby
>         Type: Improvement
>   Components: SQL
>     Versions: 10.1.1.0, 10.2.0.0
>  Environment: generic
>     Reporter: Satheesh Bandaram

>
> Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times.
> For example:
> create view V1 as select i, j from T1 union select i,j from T2;
> create view V2 as select a,b from T3 union select a,b from T4;
> insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);
> For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times.
> Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always.
> public boolean performMaterialization(JBitSet outerTables)
> 		throws StandardException
> {
> 	// RESOLVE - just say no to materialization right now - should be a cost based decision
> 	return false;
> 	/* Actual materialization, if appropriate, will be placed by our parent PRN.
> 	 * This is because PRN might have a join condition to apply.  (Materialization
> 	 * can only occur before that.
> 	 */
> 	//return true;
> } 

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira