Posted to derby-dev@db.apache.org by Jack Klebanoff <kl...@sbcglobal.net> on 2005/02/25 18:44:35 UTC

[PATCH] BackingStoreHashtable

I have attached a patch that causes BackingStoreHashtable to spill to 
disk when it gets large. BackingStoreHashtable is used to implement hash 
joins, DISTINCT, scroll insensitive cursors, and the HAVING clause. The 
unpatched BackingStoreHashtable never spills to disk. This causes Derby 
to sometimes run out of memory. See Jira report Derby-106.

One of the arguments of the BackingStoreHashtable constructor is the 
maximum number of rows to store in memory. If this argument is 
non-negative then the patched BackingStoreHashtable spills to disk when 
more than that number of rows are added to the hash table.

If the max_inmemory_rowcnt argument is negative then 
BackingStoreHashtable decides itself when to spill to disk. It does so 
when its estimate of the size of the hash table (in bytes) grows larger 
than 1% of the total memory when the BackingStoreHashtable was created. 
Currently Derby only constructs BackingStoreHashtables with 
max_inmemory_rowcnt = -1, so this mechanism is always used to decide 
when to spill.
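
Roughly speaking, the decision works like the following sketch (a
simplification, not the patch code itself; the names here are only
illustrative, the patch keeps the equivalent bookkeeping in fields such
as max_inmemory_rowcnt and max_inmemory_size):

    // Illustrative sketch of the spill decision only.
    class SpillPolicySketch {
        // In-memory budget fixed at construction time: 1% of the JVM's
        // total memory at that moment.
        private long memoryBudget = Runtime.getRuntime().totalMemory() / 100;

        // Each row kept in memory is charged against the budget; once the
        // budget is exhausted, this row and all later rows go to disk.
        boolean shouldSpill(long estimatedRowSizeInBytes) {
            if (memoryBudget <= 0) {
                return true;
            }
            memoryBudget -= estimatedRowSizeInBytes;
            return false;
        }
    }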

The disk portion of the hash table is handled in class DiskHashtable. 
This does not implement a dynamic hash table data structure. Instead it 
is implemented using an idea of Mike Matrigali. The rows are stored in a 
standard heap conglomerate, also used by standard Derby tables. A Btree 
is used to index the rows by hash code. We cannot index them by their 
actual keys because our Btree implementation limits the length of a key. 
In order to find a record by key DiskHashtable scans the Btree for all 
rows with the same hash code and matches the retrieved rows with the 
target key.
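
To make the lookup idea concrete, here is a small illustration using
plain Java collections in place of the conglomerates (this is not the
patch code; the real DiskHashtable drives a Btree scan and heap fetches
instead, but the hash-code-then-filter logic is the same):

    import java.util.*;

    // Illustration only: a map keyed on hash code stands in for the Btree,
    // and the stored Object[] rows stand in for heap rows.
    class HashCodeIndexSketch {
        private final Map<Integer, List<Object[]>> byHashCode =
            new TreeMap<Integer, List<Object[]>>();
        private final int keyColumn;

        HashCodeIndexSketch(int keyColumn) { this.keyColumn = keyColumn; }

        void put(Object[] row) {
            Integer hashCode = Integer.valueOf(row[keyColumn].hashCode());
            List<Object[]> bucket = byHashCode.get(hashCode);
            if (bucket == null) {
                bucket = new ArrayList<Object[]>();
                byHashCode.put(hashCode, bucket);
            }
            bucket.add(row);
        }

        // Narrow the search by hash code first, then filter out collisions
        // by comparing the actual key values, as described above.
        List<Object[]> get(Object key) {
            List<Object[]> matches = new ArrayList<Object[]>();
            List<Object[]> bucket =
                byHashCode.get(Integer.valueOf(key.hashCode()));
            if (bucket != null) {
                for (Object[] row : bucket) {
                    if (row[keyColumn].equals(key)) {
                        matches.add(row);
                    }
                }
            }
            return matches;
        }
    }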

The advantage of this method is that it requires little new code because 
it uses existing heap and btree conglomerate implementations. The 
disadvantage is that it is slower than a dynamic hash table structure. 
We expect that in most cases BackingStoreHashtable will not spill to 
disk, so this trade off seems appropriate.

Issues and Future Work
------------------------

The Derby community may want to consider some follow-on work.

Hash join costing should be re-evaluated now that BackingStoreHashtable 
can spill to disk. The optimizer can consider using hash joins on larger 
tables, but it should take the cost of disk access into account. This 
leads into a larger issue of optimizer costing. Our cost numbers were 
derived many years ago by timing various operations on a particular 
machine. That machine is probably no longer available for timing 
BackingStoreHashtable, and it is probably obsolete now. We should 
probably revise all of our costing numbers on a more modern machine.

We may want to implement a real dynamic disk hash structure to improve 
the speed of BackingStoreHashtable when it has spilled to disk. If it 
were faster we could use hash joins more often, potentially improving 
the speed of some Derby joins. Furthermore BackingStoreHashtable is used 
elsewhere. Our assumption that BackingStoreHashtable will seldom spill 
to disk may not be correct.

In my implementation BackingStoreHashtable keeps old rows in memory 
after it decides to spill new rows to disk. Mike Matrigali suggested 
that it should move the old rows to disk to reduce memory usage. This 
comes at the price of slowing down both the time to populate a 
BackingStoreHashtable and the time to access an element. That is why I 
chose not to move old rows to disk.

Jack Klebanoff

Re: [PATCH] BackingStoreHashtable

Posted by Jack Klebanoff <kl...@sbcglobal.net>.
Mike Matrigali wrote:

> Thanks for the reply, I'll work with you to get this committed.  I will
> wait on the change you are working on. I think that is the best short
> term solution, as you point out there is more work later on to improve
> the work you have done. I would appreciate it if at least one other 
> person with experience on the language side takes a look at this also.
>
> It has been a while since I looked at JVM memory stuff, but it used to be
> a problem that totalMemory() would return the memory that the jvm
> currently has, not the amount of memory that it is allowed to have.  So
> if you called it after just starting it might return a very small 
> number, say 1 meg, even if the jvm was started and told to grow to a max
> of 100 meg.  Worse was that the behavior was not consistent across 
> JVM/OS combinations.
>
> This memory issue is a real problem as there are a number of things
> that derby could do faster if it knew it could do the whole thing in
> memory, but once you run out of memory it is hard to recover without
> failing the current operation (and quite possibly other derby threads 
> and in a server environment other non derby threads).
>
> At one point Sun was proposing some JVM interfaces so one could tell if
> you were getting "close" to running out of memory - so that applications
> could take action before errors happened.  If such a thing existed then
> something like BackingStoreHashTable could grow in memory more 
> aggressively and then if it noticed the impending problem it could spill
> everything to disk, and free up its current usage.
>
I have modified my patch so that the optimizer and BackingStoreHashtable 
use the same decision about when a hash table will spill to disk. The 
optimizer calls the JoinStrategy.maxCapacity method to find the maximum 
number of rows that the JoinStrategy can handle in a given number of 
bytes. It rejects the strategy if the estimated row count is larger. 
(Currently the optimizer limits each join to 1 MB of memory). The 
HashJoinStrategy.maxCapacity method divides the maximum byte count by 
the sum of the size of one row plus the size of a Hashtable entry. The 
NestedLoopJoinStrategy.maxCapacity method always returns 
Integer.MAX_VALUE. The HashJoinStrategy.getScanArgs method passes the 
maximum capacity to the ResultSetFactory.getHashScanResultSet method, 
so that the actual BackingStoreHashtable will spill to disk when the 
optimizer thought that it would. This means that hash joins will not 
spill to disk unless the inner table has more rows than the optimizer 
estimated.
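
In outline, the two maxCapacity policies amount to something like this
(a sketch only; the real JoinStrategy.maxCapacity signatures differ, and
perRowUsage/perEntryOverhead are just illustrative names for the per-row
and per-Hashtable-entry size estimates):

    // Sketch of the two capacity policies; illustrative, not patch code.
    class MaxCapacitySketch {
        static int hashJoinMaxCapacity(long maxMemoryPerTable,
                                       long perRowUsage,
                                       long perEntryOverhead) {
            // How many rows fit in the per-join memory budget?
            return (int) (maxMemoryPerTable / (perRowUsage + perEntryOverhead));
        }

        static int nestedLoopMaxCapacity() {
            // A nested loop join builds no in-memory table, so it is unbounded.
            return Integer.MAX_VALUE;
        }
    }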

I also changed the DiskHashtable implementation to pass its 
keepAfterCommit parameter on to the 
TransactionController.openConglomerate method. Previously DiskHashtable 
only used keepAfterCommit to construct the temporaryFlag argument of 
TransactionController.createConglomerate and always passed "false" as 
the hold argument of TransactionController.openConglomerate.
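
In other words, keepAfterCommit now feeds into both store calls, roughly
as in the fragment below. This is a paraphrase of the DiskHashtable
constructor rather than the exact patch code (the attached diff quoted
later in this thread is the earlier version that still hard-codes the
hold argument to false); the constant names come from
TransactionController.

    // Paraphrased constructor fragment; not standalone code.
    int tempFlags = keepAfterCommit
        ? (TransactionController.IS_TEMPORARY | TransactionController.IS_KEPT)
        : TransactionController.IS_TEMPORARY;

    long conglomId = tc.createConglomerate("heap", template,
                                           (ColumnOrdering[]) null,
                                           (Properties) null,
                                           tempFlags);

    // keepAfterCommit is now also forwarded as the "hold" argument, so the
    // conglomerate stays open across a commit when the caller asked for it.
    ConglomerateController cc =
        tc.openConglomerate(conglomId,
                            keepAfterCommit,
                            TransactionController.OPENMODE_FORUPDATE,
                            TransactionController.MODE_TABLE,
                            TransactionController.ISOLATION_NOLOCK);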

Since I made changes to the optimizer and hash join code generator I 
hope that a Derby language expert can review at least that part of my 
updated patch.

I have not changed the way that BackingStoreHashtable decides when to 
spill when its max_inmemory_rowcnt parameter is negative. (Only hash 
joins pass a non-negative max_inmemory_rowcnt.) As Mike pointed out, 
spilling when the in-memory hash table grows larger than 1% of 
Runtime.totalMemory() is not completely satisfactory. The JVM may be 
able to get more memory, and totalMemory() is likely to be small soon 
after the JVM starts up. However, I do not know of anything that is 
better. If totalMemory() grows, subsequent BackingStoreHashtables will 
be able to use more memory. Since BackingStoreHashtables are 
temporary, this does not seem so bad to me.
Regards

Jack Klebanoff

Re: [PATCH] BackingStoreHashtable

Posted by Mike Matrigali <mi...@sbcglobal.net>.
Thanks for the reply, I'll work with you to get this committed.  I will
wait on the change you are working on. I think that is the best short
term solution, as you point out there is more work later on to improve
the work you have done. I would appreciate it if at least one other 
person with experience on the language side takes a look at this also.

It has been a while since I looked at JVM memory stuff, but it used to be
a problem that totalMemory() would return the memory that the jvm
currently has, not the amount of memory that it is allowed to have.  So
if you called it after just starting it might return a very small 
number, say 1 meg, even if the jvm was started and told to grow to a max
of 100 meg.  Worse was that the behavior was not consistent across 
JVM/OS combinations.

This memory issue is a real problem as there are a number of things
that derby could do faster if it knew it could do the whole thing in
memory, but once you run out of memory it is hard to recover without
failing the current operation (and quite possibly other derby threads 
and in a server environment other non derby threads).

At one point Sun was proposing some JVM interfaces so one could tell if
you were getting "close" to running out of memory - so that applications
could take action before errors happened.  If such a thing existed then
something like BackingStoreHashTable could grow in memory more 
aggressively and then if it noticed the impending problem it could spill
everything to disk, and free up its current usage.



Jack Klebanoff wrote:
> Mike Matrigali wrote:
> 
>> Any idea if many more queries will now spill to disk?  I am worried that
>> many more will, and now queries will run slower.  Seems like there 
>> should be a better connection between the optimizer estimate to use
>> in memory hash and what actually is done in memory.  As I understand it
>> now the Optimizer costs hash only as in memory, if its estimates say
>> it won't fit into memory it chooses a different plan.  I assume the 
>> other plan is more efficient than a hash plan which may have to go to 
>> disk for every probe (even if pages are cached going to latched pages
>> on every probe is a lot more expensive than in memory hash).
>>
>> My guess is no one did the work to set max_inmemory_rowcnt, because the
>> backing store stuff wasn't implemented yet.
>>
>> I have not read the code yet.  Is the 1% of all memory, or 1% of free 
>> memory?
> 
> 
> It is 1% of Runtime.getRuntime().totalMemory().
> 
> Changing the costing is difficult. The FromBaseTable.estimateCost method 
> uses the JoinStrategy.multiplyBaseCostByOuterRows method to compute 
> the cost of the join as either the cost of constructing the inner table 
> or the number of rows in the outer table times the cost of constructing 
> the inner table. HashJoinStrategy.multiplyBaseCostByOuterRows returns 
> false, so the cost of a hash join is estimated as the cost of 
> constructing the hash table, and independent of the number of rows in 
> the outer table. This is a reasonable approximation if the hash table is 
> in memory, the outer table is not enormous, and the hash function does a 
> good job.
> 
> If some of the hash table is on disk then neither join cost estimate 
> method is very good. The cost of a hash join is then approximately 
> hashtableBuildCost + outerTableRowCount*hashProbeCost, where 
> hashProbeCost is the expected cost of one probe into the hash table. 
> This will be a function of the fraction of inner table rows on disk, and 
> the number of rows on disk.
> 
> The JoinStrategy interface has a method, estimateCost, that looks like 
> it could be made to do the trick, but for some reason 
> FromBaseTable.estimateCost does not use it.
> 
> The estimate of the inner table row count is used to estimate the size 
> of the hash table. If it is large we do not use a hash join at all. If 
> we try to include the disk access cost in our hash join cost estimate we 
> have to use the inner table row count estimate to estimate the fraction 
> of rows on disk. But if the inner table passes the maximum size test 
> this fraction is zero if the row count estimate is correct, or very 
> small if we assume some statistical variation. It doesn't make sense to 
> include the disk hash table probe cost in the join cost estimate unless 
> we also ditch the maximum size restriction on hash joins. This might be 
> a good idea, but I would like to fix BackingStoreHashtable so that 
> neither hash joins, nor DISTINCT, nor scroll insensitive cursors blow up 
> before tackling join costing.
> 
> I think that the right thing to do is for HashJoinStrategy to pass the 
> correct maximum in-memory row count to BackingStoreHashtable so that the 
> two are following the same playbook with respect to spilling. I will 
> work on this.
> 
> Jack
> 
> 


Re: [PATCH] BackingStoreHashtable

Posted by Jack Klebanoff <kl...@sbcglobal.net>.
Mike Matrigali wrote:

> Any idea if many more queries will now spill to disk?  I am worried that
> many more will, and now queries will run slower.  Seems like there 
> should be a better connection between the optimizer estimate to use
> in memory hash and what actually is done in memory.  As I understand it
> now the Optimizer costs hash only as in memory, if its estimates say
> it won't fit into memory it chooses a different plan.  I assume the 
> other plan is more efficient than a hash plan which may have to go to 
> disk for every probe (even if pages are cached going to latched pages
> on every probe is a lot more expensive than in memory hash).
>
> My guess is no one did the work to set max_inmemory_rowcnt, because the
> backing store stuff wasn't implemented yet.
>
> I have not read the code yet.  Is the 1% of all memory, or 1% of free 
> memory?

It is 1% of Runtime.getRuntime().totalMemory().

Changing the costing is difficult. The FromBaseTable.estimateCost method 
uses the JoinStrategy.multiplyBaseCostByOuterRows method to compute 
the cost of the join as either the cost of constructing the inner table 
or the number of rows in the outer table times the cost of constructing 
the inner table. HashJoinStrategy.multiplyBaseCostByOuterRows returns 
false, so the cost of a hash join is estimated as the cost of 
constructing the hash table, and independent of the number of rows in 
the outer table. This is a reasonable approximation if the hash table is 
in memory, the outer table is not enormous, and the hash function does a 
good job.

If some of the hash table is on disk then neither join cost estimate 
method is very good. The cost of a hash join is then approximately 
hashtableBuildCost + outerTableRowCount*hashProbeCost, where 
hashProbeCost is the expected cost of one probe into the hash table. 
This will be a function of the fraction of inner table rows on disk, and 
the number of rows on disk.
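
As arithmetic, that estimate would look something like the sketch below.
It is purely illustrative; none of these names exist in the optimizer
today, and the probe-cost split between the in-memory and spilled parts
is just one way to model the dependence on the fraction of rows on disk.

    // Purely illustrative cost formula; not Derby optimizer code.
    class HashJoinCostSketch {
        static double hashJoinCost(double hashtableBuildCost,
                                   double outerTableRowCount,
                                   double fractionOfInnerRowsOnDisk,
                                   double inMemoryProbeCost,
                                   double diskProbeCost) {
            // A probe is cheap when it hits the in-memory part of the hash
            // table and much more expensive when it has to go to the
            // spilled (disk) part.
            double hashProbeCost =
                (1.0 - fractionOfInnerRowsOnDisk) * inMemoryProbeCost
                + fractionOfInnerRowsOnDisk * diskProbeCost;
            return hashtableBuildCost + outerTableRowCount * hashProbeCost;
        }
    }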

The JoinStrategy interface has a method, estimateCost, that looks like 
it could be made to do the trick, but for some reason 
FromBaseTable.estimateCost does not use it.

The estimate of the inner table row count is used to estimate the size 
of the hash table. If it is large we do not use a hash join at all. If 
we try to include the disk access cost in our hash join cost estimate we 
have to use the inner table row count estimate to estimate the fraction 
of rows on disk. But if the inner table passes the maximum size test 
this fraction is zero if the row count estimate is correct, or very 
small if we assume some statistical variation. It doesn't make sense to 
include the disk hash table probe cost in the join cost estimate unless 
we also ditch the maximum size restriction on hash joins. This might be 
a good idea, but I would like to fix BackingStoreHashtable so that 
neither hash joins, nor DISTINCT, nor scroll insensitive cursors blow up 
before tackling join costing.

I think that the right thing to do is for HashJoinStrategy to pass the 
correct maximum in-memory row count to BackingStoreHashtable so that the 
two are following the same playbook with respect to spilling. I will 
work on this.

Jack

Re: [PATCH] BackingStoreHashtable

Posted by Army <qo...@sbcglobal.net>.
 > Jack Klebanoff wrote:
 >
 > I have modified my patch so that the optimizer and BackingStoreHashtable
 > use the same decision about when a hash table will spill to disk.
 >
 > [snip]
 >
 > Since I made changes to the optimizer and hash join code generator I
 > hope that a Derby language expert can review at least that part of my
 > updated patch.
 >

I'm far from a "Derby language expert", but I did take the time to review the patch and it looks good to me.  I 
successfully applied the patch to my local codeline, then ran the two tests that were included, and both passed.  So not 
only do the changes/additions to the code look good, but the actual patch itself seems to be in good shape, too.

+1,
Army


Re: [PATCH] BackingStoreHashtable

Posted by Mike Matrigali <mi...@sbcglobal.net>.
Any idea if many more queries will now spill to disk?  I am worried that
many more will, and now queries will run slower.  Seems like there 
should be a better connection between the optimizer estimate to use
in memory hash and what actually is done in memory.  As I understand it
now the Optimizer costs hash only as in memory, if its estimates say
it won't fit into memory it chooses a different plan.  I assume the 
other plan is more efficient than a hash plan which may have to go to 
disk for every probe (even if pages are cached going to latched pages
on every probe is a lot more expensive than in memory hash).

My guess is no one did the work to set max_inmemory_rowcnt, because the
backing store stuff wasn't implemented yet.

I have not read the code yet.  Is the 1% of all memory, or 1% of free 
memory?


Jack Klebanoff wrote:
> I have attached a patch that causes BackingStoreHashtable to spill to 
> disk when it gets large. BackingStoreHashtable is used to implement hash 
> joins, DISTINCT, scroll insensitive cursors, and the HAVING clause. The 
> unpatched BackingStoreHashtable never spills to disk. This causes Derby 
> to sometimes run out of memory. See Jira report Derby-106.
> 
> One of the arguments of the BackingStoreHashtable constructor is the 
> maximum number of rows to store in memory. If this argument is 
> non-negative then the patched BackingStoreHashtable spills to disk when 
> more than that number of rows are added to the hash table.
> 
> If the max_inmemory_rowcnt argument is negative then 
> BackingStoreHashtable decides itself when to spill to disk. It does so 
> when its estimate of the size of the hash table (in bytes) grows larger 
> than 1% of the total memory when the BackingStoreHashtable was created. 
> Currently Derby only constructs BackingStoreHashtables with 
> max_inmemory_rowcnt = -1, so this mechanism is always used to decide 
> when to spill.
> 
> The disk portion of the hash table is handled in class DiskHashtable. 
> This does not implement a dynamic hash table data structure. Instead it 
> is implemented using an idea of Mike Matrigali. The rows are stored in a 
> standard heap conglomerate, also used by standard Derby tables. A Btree 
> is used to index the rows by hash code. We cannot index them by their 
> actual keys because our Btree implementation limits the length of a key. 
> In order to find a record by key DiskHashtable scans the Btree for all 
> rows with the same hash code and matches the retrieved rows with the 
> target key.
> 
> The advantage of this method is that it requires little new code because 
> it uses existing heap and btree conglomerate implementations. The 
> disadvantage is that it is slower than a dynamic hash table structure. 
> We expect that in most cases BackingStoreHashtable will not spill to 
> disk, so this trade off seems appropriate.
> 
> Issues and Future Work
> ------------------------
> 
> The Derby community may want to consider some follow-on work.
> 
> Hash join costing should be re-evaluated now that BackingStoreHashtable 
> can spill to disk. The optimizer can consider using hash joins on larger 
> tables, but it should take the cost of disk access into account. This 
> leads into a larger issue of optimizer costing. Our cost numbers were 
> derived many years ago by timing various operations on a particular 
> machine. That machine is probably no longer available for timing 
> BackingStoreHashtable, and it is probably obsolete now. We should 
> probably revise all of our costing numbers on a more modern machine.
> 
> We may want to implement a real dynamic disk hash structure to improve 
> the speed of BackingStoreHashtable when it has spilled to disk. If it 
> were faster we could use hash joins more often, potentially improving 
> the speed of some Derby joins. Furthermore BackingStoreHashtable is used 
> elsewhere. Our assumption that BackingStoreHashtable will seldom spill 
> to disk may not be correct.
> 
> In my implementation BackingStoreHashtable keeps old rows in memory 
> after it decides to spill new rows to disk. Mike Matrigali suggested 
> that it should move the old rows to disk to reduce memory usage. This 
> comes at the price of slowing down both the time to populate a 
> BackingStoreHashtable and the time to access an element. That is why I 
> chose not to move old rows to disk.
> 
> Jack Klebanoff
> 
> 
> ------------------------------------------------------------------------
> 
> Index: java/engine/org/apache/derby/impl/sql/execute/ScrollInsensitiveResultSet.java
> ===================================================================
> --- java/engine/org/apache/derby/impl/sql/execute/ScrollInsensitiveResultSet.java	(revision 155029)
> +++ java/engine/org/apache/derby/impl/sql/execute/ScrollInsensitiveResultSet.java	(working copy)
> @@ -66,7 +66,6 @@
>  
>  
>  	private int							sourceRowWidth;
> -	private TransactionController		tc;
>  
>  	private	  BackingStoreHashtable		ht;
>  	private	  ExecRow					resultRow;
> @@ -87,6 +86,8 @@
>  
>      private GeneratedMethod closeCleanup;
>  
> +    private boolean keepAfterCommit;
> +
>  	/**
>  	 * Constructor for a ScrollInsensitiveResultSet
>  	 *
> @@ -110,6 +111,7 @@
>  			  optimizerEstimatedRowCount, optimizerEstimatedCost);
>  		this.source = source;
>  		this.sourceRowWidth = sourceRowWidth;
> +        keepAfterCommit = activation.getResultSetHoldability();
>  		maxRows = activation.getMaxRows();
>  		if (SanityManager.DEBUG)
>  		{
> @@ -160,7 +162,7 @@
>  		 * We need BackingStoreHashtable to actually go to disk when it doesn't fit.
>  		 * This is a known limitation.
>  		 */
> -		ht = new BackingStoreHashtable(tc,
> +		ht = new BackingStoreHashtable(getTransactionController(),
>  									   null,
>  									   keyCols,
>  									   false,
> @@ -168,7 +170,8 @@
>  									   HashScanResultSet.DEFAULT_MAX_CAPACITY,
>  									   HashScanResultSet.DEFAULT_INITIAL_CAPACITY,
>  									   HashScanResultSet.DEFAULT_MAX_CAPACITY,
> -									   false);
> +									   false,
> +                                       keepAfterCommit);
>  
>  		openTime += getElapsedMillis(beginTime);
>  		setBeforeFirstRow();
> Index: java/engine/org/apache/derby/impl/sql/execute/HashTableResultSet.java
> ===================================================================
> --- java/engine/org/apache/derby/impl/sql/execute/HashTableResultSet.java	(revision 155029)
> +++ java/engine/org/apache/derby/impl/sql/execute/HashTableResultSet.java	(working copy)
> @@ -221,7 +221,8 @@
>  										   maxInMemoryRowCount,
>  										   (int) initialCapacity,
>  										   loadFactor,
> -										   skipNullKeyColumns);
> +										   skipNullKeyColumns,
> +                                           false /* Not kept after a commit */);
>  
>  			if (runTimeStatsOn)
>  			{
> Index: java/engine/org/apache/derby/impl/store/access/BackingStoreHashTableFromScan.java
> ===================================================================
> --- java/engine/org/apache/derby/impl/store/access/BackingStoreHashTableFromScan.java	(revision 155029)
> +++ java/engine/org/apache/derby/impl/store/access/BackingStoreHashTableFromScan.java	(working copy)
> @@ -97,7 +97,8 @@
>              max_inmemory_rowcnt,
>              initialCapacity,
>              loadFactor,
> -			skipNullKeyColumns);
> +			skipNullKeyColumns,
> +            false /* Do not keep the hash table after a commit. */);
>  
>          open_scan =  (ScanManager)
>              tc.openScan(
> Index: java/engine/org/apache/derby/iapi/store/access/DiskHashtable.java
> ===================================================================
> --- java/engine/org/apache/derby/iapi/store/access/DiskHashtable.java	(revision 0)
> +++ java/engine/org/apache/derby/iapi/store/access/DiskHashtable.java	(revision 0)
> @@ -0,0 +1,377 @@
> +/*
> +
> +   Derby - Class org.apache.derby.iapi.store.access.DiskHashtable
> +
> +   Copyright 2005 The Apache Software Foundation or its licensors, as applicable.
> +
> +   Licensed under the Apache License, Version 2.0 (the "License");
> +   you may not use this file except in compliance with the License.
> +   You may obtain a copy of the License at
> +
> +      http://www.apache.org/licenses/LICENSE-2.0
> +
> +   Unless required by applicable law or agreed to in writing, software
> +   distributed under the License is distributed on an "AS IS" BASIS,
> +   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> +   See the License for the specific language governing permissions and
> +   limitations under the License.
> +
> + */
> +
> +package org.apache.derby.iapi.store.access;
> +
> +import java.util.Enumeration;
> +import java.util.NoSuchElementException;
> +import java.util.Properties;
> +import java.util.Vector;
> +import org.apache.derby.iapi.error.StandardException;
> +import org.apache.derby.iapi.services.io.FormatableBitSet;
> +import org.apache.derby.iapi.types.DataValueDescriptor;
> +import org.apache.derby.iapi.types.SQLInteger;
> +import org.apache.derby.impl.store.access.heap.HeapRowLocation;
> +import org.apache.derby.iapi.types.RowLocation;
> +import org.apache.derby.iapi.services.context.ContextService;
> +import org.apache.derby.iapi.sql.conn.LanguageConnectionContext;
> +
> +/**
> + * This class is used by BackingStoreHashtable when the BackingStoreHashtable must spill to disk.
> + * It implements the methods of a hash table: put, get, remove, elements, however it is not implemented
> + * as a hash table. In order to minimize the amount of unique code it is implemented using a Btree and a heap
> + * conglomerate. The Btree indexes the hash code of the row key. The actual key may be too long for
> + * our Btree implementation.
> + *
> + * Created: Fri Jan 28 13:58:03 2005
> + *
> + * @author <a href="mailto:klebanof@us.ibm.com">Jack Klebanoff</a>
> + * @version 1.0
> + */
> +
> +public class DiskHashtable 
> +{
> +    private final long rowConglomerateId;
> +    private ConglomerateController rowConglomerate;
> +    private final long btreeConglomerateId;
> +    private ConglomerateController btreeConglomerate;
> +    private final DataValueDescriptor[] btreeRow;
> +    private final int[] key_column_numbers;
> +    private final boolean remove_duplicates;
> +    private final TransactionController tc;
> +    private final DataValueDescriptor[] row;
> +    private final DataValueDescriptor[] scanKey = { new SQLInteger()};
> +    private int size;
> +    private boolean keepStatistics;
> +
> +    /**
> +     * Creates a new <code>DiskHashtable</code> instance.
> +     *
> +     * @param tc
> +     * @param template An array of DataValueDescriptors that serves as a template for the rows.
> +     * @param key_column_numbers The indexes of the key columns (0 based)
> +     * @param remove_duplicates If true then rows with duplicate keys are removed
> +     * @param keepAfterCommit If true then the hash table is kept after a commit
> +     */
> +    public DiskHashtable( TransactionController tc,
> +                          DataValueDescriptor[] template,
> +                          int[] key_column_numbers,
> +                          boolean remove_duplicates,
> +                          boolean keepAfterCommit)
> +        throws StandardException
> +    {
> +        this.tc = tc;
> +        this.key_column_numbers = key_column_numbers;
> +        this.remove_duplicates = remove_duplicates;
> +        LanguageConnectionContext lcc = (LanguageConnectionContext)
> +				ContextService.getContextOrNull(LanguageConnectionContext.CONTEXT_ID);
> +        keepStatistics = (lcc != null) && lcc.getRunTimeStatisticsMode();
> +        row = new DataValueDescriptor[ template.length];
> +        for( int i = 0; i < row.length; i++)
> +            row[i] = template[i].getNewNull();
> +        int tempFlags = keepAfterCommit ? (TransactionController.IS_TEMPORARY | TransactionController.IS_KEPT)
> +          : TransactionController.IS_TEMPORARY;
> +        
> +        rowConglomerateId = tc.createConglomerate( "heap",
> +                                                   template,
> +                                                   (ColumnOrdering[]) null,
> +                                                   (Properties) null,
> +                                                   tempFlags);
> +        rowConglomerate = tc.openConglomerate( rowConglomerateId,
> +                                               false, // Do not hold past the end of transaction
> +                                               TransactionController.OPENMODE_FORUPDATE,
> +                                               TransactionController.MODE_TABLE,
> +                                               TransactionController.ISOLATION_NOLOCK /* Single thread only */ );
> +
> +        btreeRow = new DataValueDescriptor[] { new SQLInteger(), rowConglomerate.newRowLocationTemplate()};
> +        Properties btreeProps = new Properties();
> +        btreeProps.put( "baseConglomerateId", String.valueOf( rowConglomerateId));
> +        btreeProps.put( "rowLocationColumn", "1");
> +        btreeProps.put( "allowDuplicates", "false"); // Because the row location is part of the key
> +        btreeProps.put( "nKeyFields", "2"); // Include the row location column
> +        btreeProps.put( "nUniqueColumns", "2"); // Include the row location column
> +        btreeProps.put( "maintainParentLinks", "false");
> +        btreeConglomerateId = tc.createConglomerate( "BTREE",
> +                                                     btreeRow,
> +                                                     (ColumnOrdering[]) null,
> +                                                     btreeProps,
> +                                                     tempFlags);
> +
> +        btreeConglomerate = tc.openConglomerate( btreeConglomerateId,
> +                                                 false, // Do not hold past the end of transaction
> +                                                 TransactionController.OPENMODE_FORUPDATE,
> +                                                 TransactionController.MODE_TABLE,
> +                                                 TransactionController.ISOLATION_NOLOCK /* Single thread only */ );
> +    } // end of constructor
> +
> +    public void close() throws StandardException
> +    {
> +        btreeConglomerate.close();
> +        rowConglomerate.close();
> +        tc.dropConglomerate( btreeConglomerateId);
> +        tc.dropConglomerate( rowConglomerateId);
> +    } // end of close
> +    
> +    /**
> +     * Put a new row in the overflow structure.
> +     *
> +     * @param row The row to be inserted.
> +     * @param hashCode The row's hash code.
> +     *
> +     * @return true if the row was added,
> +     *         false if it was not added (because it was a duplicate and we are eliminating duplicates).
> +     *
> +     * @exception StandardException standard error policy
> +     */
> +    public boolean put( Object key, Object[] row)
> +        throws StandardException
> +    {
> +        boolean isDuplicate = false;
> +        if( remove_duplicates || keepStatistics)
> +        {
> +            // Go to the work of finding out whether it is a duplicate
> +            isDuplicate = (getRemove( key, false, true) != null);
> +            if( remove_duplicates && isDuplicate)
> +                return false;
> +        }
> +        rowConglomerate.insertAndFetchLocation( (DataValueDescriptor[]) row, (RowLocation) btreeRow[1]);
> +        btreeRow[0].setValue( key.hashCode());
> +        btreeConglomerate.insert( btreeRow);
> +        if( keepStatistics && !isDuplicate)
> +            size++;
> +        return true;
> +    } // end of put
> +
> +    /**
> +     * Get a row from the overflow structure.
> +     *
> +     * @param key If the rows only have one key column then the key value. If there is more than one
> +     *            key column then a KeyHasher
> +     *
> +     * @return null if there is no corresponding row,
> +     *         the row (DataValueDescriptor[]) if there is exactly one row with the key
> +     *         a Vector of all the rows with the key if there is more than one.
> +     *
> +     * @exception StandardException
> +     */
> +    public Object get( Object key)
> +        throws StandardException
> +    {
> +        return getRemove( key, false, false);
> +    }
> +
> +    private Object getRemove( Object key, boolean remove, boolean existenceOnly)
> +        throws StandardException
> +    {
> +        int hashCode = key.hashCode();
> +        int rowCount = 0;
> +        Object retValue = null;
> +
> +        scanKey[0].setValue( hashCode);
> +        ScanController scan = tc.openScan( btreeConglomerateId,
> +                                           false, // do not hold
> +                                           remove ? TransactionController.OPENMODE_FORUPDATE : 0,
> +                                           TransactionController.MODE_TABLE,
> +                                           TransactionController.ISOLATION_READ_UNCOMMITTED,
> +                                           null, // Scan all the columns
> +                                           scanKey,
> +                                           ScanController.GE,
> +                                           (Qualifier[][]) null,
> +                                           scanKey,
> +                                           ScanController.GT);
> +        try
> +        {
> +            while( scan.fetchNext( btreeRow))
> +            {
> +                if( rowConglomerate.fetch( (RowLocation) btreeRow[1], row, (FormatableBitSet) null /* all columns */)
> +                    && rowMatches( row, key))
> +                {
> +                    if( existenceOnly)
> +                        return this;
> +
> +                    rowCount++;
> +                    if( rowCount == 1)
> +                        retValue = BackingStoreHashtable.cloneRow( row);
> +                    else 
> +                    {
> +                        Vector v;
> +                        if( rowCount == 2)
> +                        {
> +                            v = new Vector( 2);
> +                            v.add( retValue);
> +                            retValue = v;
> +                        }
> +                        else
> +                            v = (Vector) retValue;
> +                        v.add( BackingStoreHashtable.cloneRow( row));
> +                    }
> +                    if( remove)
> +                    {
> +                        rowConglomerate.delete( (RowLocation) btreeRow[1]);
> +                        scan.delete();
> +                        size--;
> +                    }
> +                    if( remove_duplicates)
> +                        // This must be the only row with the key
> +                        return retValue;
> +                }
> +            }
> +        }
> +        finally
> +        {
> +            scan.close();
> +        }
> +        return retValue;
> +    } // end of getRemove
> +
> +
> +    private boolean rowMatches( DataValueDescriptor[] row,
> +                                Object key)
> +    {
> +        if( key_column_numbers.length == 1)
> +            return row[ key_column_numbers[0]].equals( key);
> +
> +        KeyHasher kh = (KeyHasher) key;
> +        for( int i = 0; i < key_column_numbers.length; i++)
> +        {
> +            if( ! row[ key_column_numbers[i]].equals( kh.getObject(i)))
> +                return false;
> +        }
> +        return true;
> +    } // end of rowMatches
> +
> +    /**
> +     * remove all rows with a given key from the hash table.
> +     *
> +     * @param key          The key of the rows to remove.
> +     *
> +     * @return The removed row(s).
> +     *
> +	 * @exception  StandardException  Standard exception policy.
> +     **/
> +    public Object remove( Object key)
> +		throws StandardException
> +    {
> +        return getRemove( key, true, false);
> +    } // end of remove
> +
> +    /**
> +     * @return The number of rows in the hash table
> +     */
> +    public int size()
> +    {
> +        return size;
> +    }
> +    
> +    /**
> +     * Return an Enumeration that can be used to scan entire table.
> +     * <p>
> +     * RESOLVE - is it worth it to support this routine?
> +     *
> +	 * @return The Enumeration.
> +     *
> +	 * @exception  StandardException  Standard exception policy.
> +     **/
> +    public Enumeration elements()
> +        throws StandardException
> +    {
> +        return new ElementEnum();
> +    }
> +
> +    private class ElementEnum implements Enumeration
> +    {
> +        private ScanController scan;
> +        private boolean hasMore;
> +
> +        ElementEnum()
> +        {
> +            try
> +            {
> +                scan = tc.openScan( rowConglomerateId,
> +                                    false, // do not hold
> +                                    0, // read only
> +                                    TransactionController.MODE_TABLE,
> +                                    TransactionController.ISOLATION_NOLOCK,
> +                                    (FormatableBitSet) null, // all columns
> +                                    (DataValueDescriptor[]) null, // no start key
> +                                    0, // no start key operator
> +                                    (Qualifier[][]) null,
> +                                    (DataValueDescriptor[]) null, // no stop key
> +                                    0 /* no stop key operator */);
> +                hasMore = scan.next();
> +                if( ! hasMore)
> +                {
> +                    scan.close();
> +                    scan = null;
> +                }
> +            }
> +            catch( StandardException se)
> +            {
> +                hasMore = false;
> +                if( scan != null)
> +                {
> +                    try
> +                    {
> +                        scan.close();
> +                    }
> +                    catch( StandardException se1){};
> +                    scan = null;
> +                }
> +            }
> +        } // end of constructor
> +
> +        public boolean hasMoreElements()
> +        {
> +            return hasMore;
> +        }
> +
> +        public Object nextElement()
> +        {
> +            if( ! hasMore)
> +                throw new NoSuchElementException();
> +            try
> +            {
> +                scan.fetch( row);
> +                Object retValue =  BackingStoreHashtable.cloneRow( row);
> +                hasMore = scan.next();
> +                if( ! hasMore)
> +                {
> +                    scan.close();
> +                    scan = null;
> +                }
> +
> +                return retValue;
> +            }
> +            catch( StandardException se)
> +            {
> +                if( scan != null)
> +                {
> +                    try
> +                    {
> +                        scan.close();
> +                    }
> +                    catch( StandardException se1){};
> +                    scan = null;
> +                }
> +                throw new NoSuchElementException();
> +            }
> +        } // end of nextElement
> +    } // end of class ElementEnum
> +}
> Index: java/engine/org/apache/derby/iapi/store/access/BackingStoreHashtable.java
> ===================================================================
> --- java/engine/org/apache/derby/iapi/store/access/BackingStoreHashtable.java	(revision 155029)
> +++ java/engine/org/apache/derby/iapi/store/access/BackingStoreHashtable.java	(working copy)
> @@ -29,10 +29,13 @@
>  import org.apache.derby.iapi.types.CloneableObject;
>  import org.apache.derby.iapi.types.DataValueDescriptor;
>  
> +import org.apache.derby.iapi.services.cache.ClassSize;
> +
>  import java.util.Enumeration;
>  import java.util.Hashtable;
>  import java.util.Properties; 
>  import java.util.Vector;
> +import java.util.NoSuchElementException;
>  
>  /**
>  A BackingStoreHashtable is a utility class which will store a set of rows into
> @@ -102,13 +105,36 @@
>       * Fields of the class
>       **************************************************************************
>       */
> +    private TransactionController tc;
>      private Hashtable   hash_table;
>      private int[]       key_column_numbers;
>      private boolean     remove_duplicates;
>  	private boolean		skipNullKeyColumns;
>      private Properties  auxillary_runtimestats;
>  	private RowSource	row_source;
> +    /* If max_inmemory_rowcnt > 0 then use that to decide when to spill to disk.
> +     * Otherwise compute max_inmemory_size based on the JVM memory size when the BackingStoreHashtable
> +     * is constructed and use that to decide when to spill to disk.
> +     */
> +    private long max_inmemory_rowcnt;
> +    private long inmemory_rowcnt;
> +    private long max_inmemory_size;
> +    private boolean keepAfterCommit;
>  
> +    private static int vectorSize; // The estimated number of bytes used by Vector(0)
> +    static {
> +        try
> +        {
> +            vectorSize = ClassSize.estimateBase( java.util.Vector.class);
> +        }
> +        catch( SecurityException se)
> +        {
> +            vectorSize = 4*ClassSize.refSize;
> +        }
> +    };
> +    
> +    private DiskHashtable diskHashtable;
> +
>      /**************************************************************************
>       * Constructors for This class:
>       **************************************************************************
> @@ -163,7 +189,10 @@
>  	 *
>  	 * @param skipNullKeyColumns	Skip rows with a null key column, if true.
>       *
> +     * @param keepAfterCommit If true the hash table is kept after a commit,
> +     *                        if false the hash table is dropped on the next commit.
>       *
> +     *
>  	 * @exception  StandardException  Standard exception policy.
>       **/
>      public BackingStoreHashtable(
> @@ -175,13 +204,21 @@
>      long                    max_inmemory_rowcnt,
>      int                     initialCapacity,
>      float                   loadFactor,
> -	boolean					skipNullKeyColumns)
> +	boolean					skipNullKeyColumns,
> +    boolean                 keepAfterCommit)
>          throws StandardException
>      {
>          this.key_column_numbers    = key_column_numbers;
>          this.remove_duplicates    = remove_duplicates;
>  		this.row_source			   = row_source;
>  		this.skipNullKeyColumns	   = skipNullKeyColumns;
> +        this.max_inmemory_rowcnt = max_inmemory_rowcnt;
> +        if( max_inmemory_rowcnt > 0)
> +            max_inmemory_size = Long.MAX_VALUE;
> +        else
> +            max_inmemory_size = Runtime.getRuntime().totalMemory()/100;
> +        this.tc = tc;
> +        this.keepAfterCommit = keepAfterCommit;
>  
>          Object[] row;
>  
> @@ -280,7 +317,7 @@
>       *
>  	 * @exception  StandardException  Standard exception policy.
>       **/
> -    private Object[] cloneRow(Object[] old_row)
> +    static Object[] cloneRow(Object[] old_row)
>          throws StandardException
>      {
>          Object[] new_row = new DataValueDescriptor[old_row.length];
> @@ -300,8 +337,6 @@
>       * @param row               Row to add to the hash table.
>       * @param hash_table        The java HashTable to load into.
>       *
> -	 * @return true if successful, false if heap add fails.
> -     *
>  	 * @exception  StandardException  Standard exception policy.
>       **/
>      private void add_row_to_hash_table(
> @@ -310,9 +345,14 @@
>      Object[]    row)
>  		throws StandardException
>      {
> +        if( spillToDisk( hash_table, key, row))
> +            return;
> +        
>          Object  duplicate_value = null;
>  
> -        if ((duplicate_value = hash_table.put(key, row)) != null)
> +        if ((duplicate_value = hash_table.put(key, row)) == null)
> +            doSpaceAccounting( row, false);
> +        else
>          {
>              if (!remove_duplicates)
>              {
> @@ -321,6 +361,7 @@
>                  // inserted a duplicate
>                  if ((duplicate_value instanceof Vector))
>                  {
> +                    doSpaceAccounting( row, false);
>                      row_vec = (Vector) duplicate_value;
>                  }
>                  else
> @@ -330,6 +371,7 @@
>  
>                      // insert original row into vector
>                      row_vec.addElement(duplicate_value);
> +                    doSpaceAccounting( row, true);
>                  }
>  
>                  // insert new row into vector
> @@ -345,6 +387,89 @@
>          row = null;
>      }
>  
> +    private void doSpaceAccounting( Object[] row,
> +                                    boolean firstDuplicate)
> +    {
> +        inmemory_rowcnt++;
> +        if( max_inmemory_rowcnt <= 0)
> +        {
> +            for( int i = 0; i < row.length; i++)
> +            {
> +                if( row[i] instanceof DataValueDescriptor)
> +                    max_inmemory_size -= ((DataValueDescriptor) row[i]).estimateMemoryUsage();
> +                max_inmemory_size -= ClassSize.refSize;
> +            }
> +            max_inmemory_size -= ClassSize.refSize;
> +            if( firstDuplicate)
> +                max_inmemory_size -= vectorSize;
> +        }
> +    } // end of doSpaceAccounting
> +
> +    /**
> +     * Determine whether a new row should be spilled to disk and, if so, do it.
> +     *
> +     * @param hash_table The in-memory hash table
> +     * @param key The row's key
> +     * @param row
> +     *
> +     * @return true if the row was spilled to disk, false if not
> +     *
> +     * @exception  StandardException  Standard exception policy.
> +     */
> +    private boolean spillToDisk( Hashtable   hash_table,
> +                                 Object      key,
> +                                 Object[]    row)
> +		throws StandardException
> +    {
> +        // Once we have started spilling all new rows will go to disk, even if we have freed up some
> +        // memory by moving duplicates to disk. This simplifies handling of duplicates and accounting.
> +        if( diskHashtable == null)
> +        {
> +            if( max_inmemory_rowcnt > 0)
> +            {
> +                if( inmemory_rowcnt < max_inmemory_rowcnt)
> +                    return false; // Do not spill
> +            }
> +            else if( max_inmemory_size > 0)
> +                return false;
> +            // Want to start spilling
> +            if( ! (row instanceof DataValueDescriptor[]))
> +            {
> +                if( SanityManager.DEBUG)
> +                    SanityManager.THROWASSERT( "BackingStoreHashtable row is not DataValueDescriptor[]");
> +                // Do not know how to put it on disk
> +                return false;
> +            }
> +            diskHashtable = new DiskHashtable( tc,
> +                                               (DataValueDescriptor[]) row,
> +                                               key_column_numbers,
> +                                               remove_duplicates,
> +                                               keepAfterCommit);
> +        }
> +        
> +        Object duplicateValue = hash_table.get( key);
> +        if( duplicateValue != null)
> +        {
> +            if( remove_duplicates)
> +                return true; // a degenerate case of spilling
> +            // If we are keeping duplicates then move all the duplicates from memory to disk
> +            // This simplifies finding duplicates: they are either all in memory or all on disk.
> +            if( duplicateValue instanceof Vector)
> +            {
> +                Vector duplicateVec = (Vector) duplicateValue;
> +                for( int i = duplicateVec.size() - 1; i >= 0; i--)
> +                {
> +                    Object[] dupRow = (Object[]) duplicateVec.elementAt(i);
> +                    diskHashtable.put( key, dupRow);
> +                }
> +            }
> +            else
> +                diskHashtable.put( key, (Object []) duplicateValue);
> +            hash_table.remove( key);
> +        }
> +        diskHashtable.put( key, row);
> +        return true;
> +    } // end of spillToDisk
>      /**************************************************************************
>       * Public Methods of This class:
>       **************************************************************************
> @@ -364,6 +489,11 @@
>  		throws StandardException
>      {
>          hash_table = null;
> +        if( diskHashtable != null)
> +        {
> +            diskHashtable.close();
> +            diskHashtable = null;
> +        }
>          return;
>      }
>  
> @@ -380,7 +510,9 @@
>      public Enumeration elements()
>          throws StandardException
>      {
> -        return(hash_table.elements());
> +        if( diskHashtable == null)
> +            return(hash_table.elements());
> +        return new BackingStoreHashtableEnumeration();
>      }
>  
>      /**
> @@ -420,7 +552,10 @@
>      public Object get(Object key)
>  		throws StandardException
>      {
> -        return(hash_table.get(key));
> +        Object obj = hash_table.get(key);
> +        if( diskHashtable == null || obj != null)
> +            return obj;
> +        return diskHashtable.get( key);
>      }
>  
>      /**
> @@ -451,7 +586,10 @@
>      Object      key)
>  		throws StandardException
>      {
> -        return(hash_table.remove(key));
> +        Object obj = hash_table.remove(key);
> +        if( obj != null || diskHashtable == null)
> +            return obj;
> +        return diskHashtable.remove(key);
>      }
>  
>      /**
> @@ -553,7 +691,54 @@
>      public int size()
>  		throws StandardException
>      {
> -        return(hash_table.size());
> +        if( diskHashtable == null)
> +            return(hash_table.size());
> +        return hash_table.size() + diskHashtable.size();
>      }
>  
> +    private class BackingStoreHashtableEnumeration implements Enumeration
> +    {
> +        private Enumeration memoryEnumeration;
> +        private Enumeration diskEnumeration;
> +
> +        BackingStoreHashtableEnumeration()
> +        {
> +            memoryEnumeration = hash_table.elements();
> +            if( diskHashtable != null)
> +            {
> +                try
> +                {
> +                    diskEnumeration = diskHashtable.elements();
> +                }
> +                catch( StandardException se)
> +                {
> +                    diskEnumeration = null;
> +                }
> +            }
> +        }
> +        
> +        public boolean hasMoreElements()
> +        {
> +            if( memoryEnumeration != null)
> +            {
> +                if( memoryEnumeration.hasMoreElements())
> +                    return true;
> +                memoryEnumeration = null;
> +            }
> +            if( diskEnumeration == null)
> +                return false;
> +            return diskEnumeration.hasMoreElements();
> +        }
> +
> +        public Object nextElement() throws NoSuchElementException
> +        {
> +            if( memoryEnumeration != null)
> +            {
> +                if( memoryEnumeration.hasMoreElements())
> +                    return memoryEnumeration.nextElement();
> +                memoryEnumeration = null;
> +            }
> +            return diskEnumeration.nextElement();
> +        }
> +    } // end of class BackingStoreHashtableEnumeration
>  }
> Index: java/testing/org/apache/derbyTesting/functionTests/tests/lang/SpillHash.java
> ===================================================================
> --- java/testing/org/apache/derbyTesting/functionTests/tests/lang/SpillHash.java	(revision 0)
> +++ java/testing/org/apache/derbyTesting/functionTests/tests/lang/SpillHash.java	(revision 0)
> @@ -0,0 +1,459 @@
> +/*
> +
> +   Derby - Class org.apache.derbyTesting.functionTests.tests.lang.bug4356
> +
> +   Copyright 2001, 2004 The Apache Software Foundation or its licensors, as applicable.
> +
> +   Licensed under the Apache License, Version 2.0 (the "License");
> +   you may not use this file except in compliance with the License.
> +   You may obtain a copy of the License at
> +
> +      http://www.apache.org/licenses/LICENSE-2.0
> +
> +   Unless required by applicable law or agreed to in writing, software
> +   distributed under the License is distributed on an "AS IS" BASIS,
> +   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> +   See the License for the specific language governing permissions and
> +   limitations under the License.
> +
> + */
> +
> +package org.apache.derbyTesting.functionTests.tests.lang;
> +
> +import java.sql.Connection;
> +import java.sql.DriverManager;
> +import java.sql.DatabaseMetaData;
> +import java.sql.ResultSet;
> +import java.sql.PreparedStatement;
> +import java.sql.Statement;
> +import java.sql.SQLException;
> +import java.util.BitSet;
> +
> +import org.apache.derby.tools.ij;
> +import org.apache.derby.tools.JDBCDisplayUtil;
> +
> +/**
> + * Test BackingStoreHashtable spilling to disk.
> + * BackingStoreHashtable is used to implement hash joins, distinct, scroll insensitive cursors,
> + * outer joins, and the HAVING clause.
> + */
> +public class SpillHash
> +{
> +    private static PreparedStatement joinStmt;
> +    private static PreparedStatement distinctStmt;
> +    private static PreparedStatement ojStmt1;
> +    private static final int LOTS_OF_ROWS = 10000;
> +    private static int errorCount = 0;
> +    
> +    public static void main (String args[]) 
> +    {
> +        try {
> +            /* Load the JDBC Driver class */
> +            // use the ij utility to read the property file and
> +            // make the initial connection.
> +            ij.getPropertyArg(args);
> +            Connection conn = ij.startJBMS();
> +            Statement stmt = conn.createStatement();
> +
> +            for( int i = 0; i < prep.length; i++)
> +                stmt.executeUpdate( prep[i]);
> +            PreparedStatement insA = conn.prepareStatement( "insert into ta(ca1,ca2) values(?,?)");
> +            PreparedStatement insB = conn.prepareStatement( "insert into tb(cb1,cb2) values(?,?)");
> +            insertDups( insA, insB, initDupVals);
> +
> +            joinStmt =
> +              conn.prepareStatement( "select ta.ca1, ta.ca2, tb.cb2 from ta, tb where ca1 = cb1");
> +            distinctStmt =
> +              conn.prepareStatement( "select distinct ca1 from ta");
> +            ojStmt1 =
> +              conn.prepareStatement( "select cc1, cc2 from tc group by cc1,cc2 having cc1 in (select ta.ca1 from ta)");
> +
> +            runStatements( conn, 0, new String[][][] {initDupVals});
> +
> +            System.out.println( "Growing database.");
> +            
> +            // Add a lot of rows so that the hash tables have to spill to disk
> +            conn.setAutoCommit(false);
> +            for( int i = 1; i <= LOTS_OF_ROWS; i++)
> +            {
> +                insA.setInt(1, i);
> +                insA.setString(2, ca2Val(i));
> +                insA.executeUpdate();
> +                insB.setInt(1, i);
> +                insB.setString(2, cb2Val(i));
> +                insB.executeUpdate();
> +            }
> +            conn.commit();
> +            insertDups( insA, insB, spillDupVals);
> +            conn.commit();
> +
> +            conn.setAutoCommit(true);
> +            runStatements( conn, LOTS_OF_ROWS, new String[][][] {initDupVals, spillDupVals});
> +            
> +            conn.close();
> +        } catch (Exception e) {
> +            System.out.println("FAIL -- unexpected exception "+e);
> +            JDBCDisplayUtil.ShowException(System.out, e);
> +            e.printStackTrace();
> +            errorCount++;
> +        }
> +        if( errorCount == 0)
> +        {
> +            System.out.println( "PASSED.");
> +            System.exit(0);
> +        }
> +        else
> +        {
> +            System.out.println( "FAILED: " + errorCount + ((errorCount == 1) ? " error" : " errors"));
> +            System.exit(1);
> +        }
> +    } // end of main
> +    
> +    private static final String[] prep =
> +    {
> +        "create table ta (ca1 integer, ca2 char(200))",
> +        "create table tb (cb1 integer, cb2 char(200))",
> +        "create table tc (cc1 integer, cc2 varchar(16))",
> +        "insert into ta(ca1,ca2) values(null, 'Anull')",
> +        "insert into tb(cb1,cb2) values(null, 'Bnull')",
> +        "insert into tc(cc1,cc2) values(0, 'C0')"
> +    };
> +
> +    private static final String[][] initDupVals =
> +    {
> +        { "0a", "0b"},
> +        { "1a", "1b"},
> +        { "2a"}
> +    };
> +    private static final String[][] spillDupVals =
> +    {
> +        {},
> +        { "1c"},
> +        { "2b"},
> +        { "3a", "3b", "3c"}
> +    };
> +
> +    private static void insertDups( PreparedStatement insA, PreparedStatement insB, String[][] dupVals)
> +        throws SQLException
> +    {
> +        for( int i = 0; i < dupVals.length; i++)
> +        {
> +            insA.setInt(1, -i);
> +            insB.setInt(1, -i);
> +            String[] vals = dupVals[i];
> +            for( int j = 0; j < vals.length; j++)
> +            {
> +                insA.setString( 2, "A" + vals[j]);
> +                insA.executeUpdate();
> +                insB.setString( 2, "B" + vals[j]);
> +                insB.executeUpdate();
> +            }
> +        }
> +    } // end of insertDups
> +    
> +    private static String ca2Val( int col1Val)
> +    {
> +        return "A" + col1Val;
> +    }
> +    
> +    private static String cb2Val( int col1Val)
> +    {
> +        return "B" + col1Val;
> +    }
> +    
> +    private static void runStatements( Connection conn, int maxColValue, String[][][] dupVals)
> +        throws SQLException
> +    {
> +        runJoin( conn, maxColValue, dupVals);
> +        runDistinct( conn, maxColValue, dupVals);
> +        runOj( conn, maxColValue, dupVals, "1", ojStmt1);
> +        runCursor( conn, maxColValue, dupVals);
> +    }
> +
> +    private static void runJoin( Connection conn, int maxColValue, String[][][] dupVals)
> +        throws SQLException
> +    {
> +        System.out.println( "Running join");
> +        int expectedRowCount = maxColValue; // plus expected duplicates, to be counted below
> +        ResultSet rs = joinStmt.executeQuery();
> +        BitSet joinRowFound = new BitSet( maxColValue);
> +        int dupKeyCount = 0;
> +        for( int i = 0; i < dupVals.length; i++)
> +        {
> +            if( dupVals[i].length > dupKeyCount)
> +                dupKeyCount = dupVals[i].length;
> +        }
> +        BitSet[] dupsFound = new BitSet[dupKeyCount];
> +        int[] dupCount = new int[ dupKeyCount];
> +        for( int i = 0; i < dupKeyCount; i++)
> +        {
> +            // count the number of rows with column(1) == -i
> +            dupCount[i] = 0;
> +            for( int j = 0; j < dupVals.length; j++)
> +            {
> +                if( i < dupVals[j].length)
> +                    dupCount[i] += dupVals[j][i].length;
> +            }
> +            dupsFound[i] = new BitSet(dupCount[i]*dupCount[i]);
> +            expectedRowCount += dupCount[i]*dupCount[i];
> +        }
> +        
> +        int count;
> +        for( count = 0; rs.next(); count++)
> +        {
> +            int col1Val = rs.getInt(1);
> +            if( rs.wasNull())
> +            {
> +                System.out.println( "Null in join column.");
> +                errorCount++;
> +                continue;
> +            }
> +            if( col1Val > maxColValue)
> +            {
> +                System.out.println( "Invalid value in first join column.");
> +                errorCount++;
> +                continue;
> +            }
> +            if( col1Val > 0)
> +            {
> +                if( joinRowFound.get( col1Val - 1))
> +                {
> +                    System.out.println( "Multiple rows for value " + col1Val);
> +                    errorCount++;
> +                }
> +                joinRowFound.set( col1Val - 1);
> +                String col2Val = trim( rs.getString(2));
> +                String col3Val = trim( rs.getString(3));
> +                if( !( ca2Val( col1Val).equals( col2Val) && cb2Val( col1Val).equals( col3Val)))
> +                {
> +                    System.out.println( "Incorrect value in column 2 or 3 of join.");
> +                    errorCount++;
> +                }
> +            }
> +            else // col1Val <= 0, there are duplicates in the source tables
> +            {
> +                int dupKeyIdx = -col1Val;
> +                int col2Idx = findDupVal( rs, 2, 'A', dupKeyIdx, dupVals);
> +                int col3Idx = findDupVal( rs, 3, 'B', dupKeyIdx, dupVals);
> +                if( col2Idx < 0 || col3Idx < 0)
> +                    continue;
> +
> +                int idx = col2Idx + dupCount[dupKeyIdx]*col3Idx;
> +                if( dupsFound[dupKeyIdx].get( idx))
> +                {
> +                    System.out.println( "Repeat of row with key value 0");
> +                    errorCount++;
> +                }
> +                dupsFound[dupKeyIdx].set( idx);
> +            }
> +        }
> +        if( count != expectedRowCount)
> +        {
> +            System.out.println( "Incorrect number of rows in join.");
> +            errorCount++;
> +        }
> +        rs.close();
> +    } // end of runJoin
> +
> +    private static int findDupVal( ResultSet rs, int col, char prefix, int keyIdx, String[][][] dupVals)
> +        throws SQLException
> +    {
> +        String colVal = rs.getString(col);
> +        if( colVal != null && colVal.length() > 1 && colVal.charAt(0) == prefix)
> +        {
> +            colVal = trim( colVal.substring( 1));
> +            int dupIdx = 0;
> +            for( int i = 0; i < dupVals.length; i++)
> +            {
> +                if( keyIdx < dupVals[i].length)
> +                {
> +                    for( int j = 0; j < dupVals[i][keyIdx].length; j++, dupIdx++)
> +                    {
> +                        if( colVal.equals( dupVals[i][keyIdx][j]))
> +                            return dupIdx;
> +                    }
> +                }
> +            }
> +        }
> +        System.out.println( "Incorrect value in column " + col + " of join with duplicate keys.");
> +        errorCount++;
> +        return -1;
> +    } // end of findDupVal
> +        
> +    private static String trim( String str)
> +    {
> +        if( str == null)
> +            return str;
> +        return str.trim();
> +    }
> +    
> +    private static void runDistinct( Connection conn, int maxColValue, String[][][] dupVals)
> +        throws SQLException
> +    {
> +        System.out.println( "Running distinct");
> +        ResultSet rs = distinctStmt.executeQuery();
> +        checkAllCa1( rs, false, maxColValue, dupVals, "DISTINCT");
> +    }
> +
> +    private static void checkAllCa1( ResultSet rs, boolean expectDups, int maxColValue, String[][][] dupVals, String label)
> +        throws SQLException
> +    {
> +        int dupKeyCount = 0;
> +        for( int i = 0; i < dupVals.length; i++)
> +        {
> +            if( dupVals[i].length > dupKeyCount)
> +                dupKeyCount = dupVals[i].length;
> +        }
> +        int[] expectedDupCount = new int[dupKeyCount];
> +        int[] dupFoundCount = new int[dupKeyCount];
> +        for( int i = 0; i < dupKeyCount; i++)
> +        {
> +            
> +            dupFoundCount[i] = 0;
> +            if( !expectDups)
> +                expectedDupCount[i] = 1;
> +            else
> +            {
> +                expectedDupCount[i] = 0;
> +                for( int j = 0; j < dupVals.length; j++)
> +                {
> +                    if( i < dupVals[j].length)
> +                        expectedDupCount[i] += dupVals[j][i].length;
> +                }
> +            }
> +        }
> +        BitSet found = new BitSet( maxColValue);
> +        int count = 0;
> +        boolean nullFound = false;
> +        try
> +        {
> +            for( count = 0; rs.next();)
> +            {
> +                int col1Val = rs.getInt(1);
> +                if( rs.wasNull())
> +                {
> +                    if( nullFound)
> +                    {
> +                        System.out.println( "Too many nulls returned by " + label);
> +                        errorCount++;
> +                        continue;
> +                    }
> +                    nullFound = true;
> +                    continue;
> +                }
> +                if( col1Val <= -dupKeyCount || col1Val > maxColValue)
> +                {
> +                    System.out.println( "Invalid value returned by " + label);
> +                    errorCount++;
> +                    continue;
> +                }
> +                if( col1Val <= 0)
> +                {
> +                    dupFoundCount[ -col1Val]++;
> +                    if( !expectDups)
> +                    {
> +                        if( dupFoundCount[ -col1Val] > 1)
> +                        {
> +                            System.out.println( label + " returned a duplicate.");
> +                            errorCount++;
> +                            continue;
> +                        }
> +                    }
> +                    else if( dupFoundCount[ -col1Val] > expectedDupCount[ -col1Val])
> +                    {
> +                        System.out.println( label + " returned too many duplicates.");
> +                        errorCount++;
> +                        continue;
> +                    }
> +                }
> +                else
> +                {
> +                    if( found.get( col1Val))
> +                    {
> +                        System.out.println( label + " returned a duplicate.");
> +                        errorCount++;
> +                        continue;
> +                    }
> +                    found.set( col1Val);
> +                    count++;
> +                }
> +            }
> +            if( count != maxColValue)
> +            {
> +                System.out.println( "Incorrect number of rows in " + label);
> +                errorCount++;
> +            }
> +            for( int i = 0; i < dupFoundCount.length; i++)
> +            {
> +                if( dupFoundCount[i] != expectedDupCount[i])
> +                {
> +                    System.out.println( "A duplicate key row is missing in " + label);
> +                    errorCount++;
> +                    break;
> +                }
> +            }
> +        }
> +        finally
> +        {
> +            rs.close();
> +        }
> +    } // End of checkAllCa1
> +
> +    private static void runOj( Connection conn,
> +                               int maxColValue,
> +                               String[][][] dupVals,
> +                               String ojId,
> +                               PreparedStatement ojStmt)
> +        throws SQLException
> +    {
> +        System.out.println( "Running outer join " + ojId);
> +        ResultSet rs = ojStmt.executeQuery();
> +        try
> +        {
> +            if( ! rs.next())
> +            {
> +                System.out.println( "Outer join " + ojId + " returned no rows.");
> +                errorCount++;
> +                return;
> +            }
> +            int cc1 = rs.getInt(1);
> +            if( rs.wasNull())
> +            {
> +                System.out.println( "Outer join " + ojId + " returned null");
> +                errorCount++;
> +                return;
> +            }
> +            String cc2 = trim(rs.getString(2));
> +            if( rs.wasNull())
> +            {
> +                System.out.println( "Outer join " + ojId + " returned null");
> +                errorCount++;
> +                return;
> +            }
> +            if( cc1 != 0 || ! "C0".equals( cc2))
> +            {
> +                System.out.println( "Outer join " + ojId + " returned wrong values");
> +                errorCount++;
> +                return;
> +            }
> +            if( rs.next())
> +            {
> +                System.out.println( "Outer join " + ojId + " returned too many rows.");
> +                errorCount++;
> +            }
> +        }
> +        finally
> +        {
> +            rs.close();
> +        }
> +    } // End of runOj
> +
> +    private static void runCursor( Connection conn, int maxColValue, String[][][] dupVals)
> +        throws SQLException
> +    {
> +        System.out.println( "Running scroll insensitive cursor");
> +        Statement stmt = conn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
> +        ResultSet rs = stmt.executeQuery( "SELECT ca1 FROM ta");
> +        checkAllCa1( rs, true, maxColValue, dupVals, "scroll insensitive cursor");
> +    }
> +}
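
As a cross-check on the row counting in runJoin() above, the expected totals work
out as follows (a sketch derived from the data arrays, not additional test code).
Before the database is grown, maxColValue is 0 and only the duplicate keys 0, -1
and -2 can join, so the test expects 2*2 + 2*2 + 1*1 = 9 rows. After the database
is grown, the 10000 unique keys each join exactly once and the duplicate keys 0
through -3 contribute 2*2 + 3*3 + 2*2 + 3*3 = 26 more, for 10026 rows in all. The
rows with null join columns inserted by the prep statements never match anything.
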
> Index: java/testing/org/apache/derbyTesting/functionTests/tests/store/TestDiskHashtable.java
> ===================================================================
> --- java/testing/org/apache/derbyTesting/functionTests/tests/store/TestDiskHashtable.java	(revision 0)
> +++ java/testing/org/apache/derbyTesting/functionTests/tests/store/TestDiskHashtable.java	(revision 0)
> @@ -0,0 +1,432 @@
> +/*
> +
> +   Derby - Class org.apache.derbyTesting.functionTests.tests.store.TestDiskHashtable
> +
> +   Copyright 2005 The Apache Software Foundation or its licensors, as applicable.
> +
> +   Licensed under the Apache License, Version 2.0 (the "License");
> +   you may not use this file except in compliance with the License.
> +   You may obtain a copy of the License at
> +
> +      http://www.apache.org/licenses/LICENSE-2.0
> +
> +   Unless required by applicable law or agreed to in writing, software
> +   distributed under the License is distributed on an "AS IS" BASIS,
> +   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> +   See the License for the specific language governing permissions and
> +   limitations under the License.
> +
> + */
> +
> +package org.apache.derbyTesting.functionTests.tests.store;
> +
> +import java.sql.Connection;
> +import java.sql.ResultSet;
> +import java.sql.SQLException;
> +import java.sql.Statement;
> +
> +import java.util.BitSet;
> +import java.util.Enumeration;
> +import java.util.HashMap;
> +import java.util.Vector;
> +
> +import org.apache.derby.iapi.error.PublicAPI;
> +import org.apache.derby.iapi.error.StandardException;
> +import org.apache.derby.iapi.sql.conn.ConnectionUtil;
> +import org.apache.derby.iapi.sql.conn.LanguageConnectionContext;
> +import org.apache.derby.iapi.store.access.DiskHashtable;
> +import org.apache.derby.iapi.store.access.KeyHasher;
> +import org.apache.derby.iapi.store.access.TransactionController;
> +import org.apache.derby.iapi.types.DataValueDescriptor;
> +import org.apache.derby.iapi.types.Orderable;
> +import org.apache.derby.iapi.types.SQLInteger;
> +import org.apache.derby.iapi.types.SQLLongint;
> +import org.apache.derby.iapi.types.SQLVarchar;
> +import org.apache.derby.tools.ij;
> +import org.apache.derbyTesting.functionTests.util.TestUtil;
> +
> +/**
> + * This program tests the org.apache.derby.iapi.store.access.DiskHashtable class.
> + * The unit test interface is not used because it is undocumented, hard to decipher, and makes
> + * problems difficult to diagnose.
> + *
> + * Created: Wed Feb 09 15:44:12 2005
> + *
> + * @author <a href="mailto:klebanof@us.ibm.com">Jack Klebanoff</a>
> + * @version 1.0
> + */
> +public class TestDiskHashtable 
> +{
> +    private TransactionController tc;
> +    private int failed = 0;
> +    
> +    public static void main( String args[])
> +    {
> +        int failed = 1;
> +
> +		REPORT("Test DiskHashtable starting");
> +        try
> +        {
> +			// use the ij utility to read the property file and
> +			// make the initial connection.
> +			ij.getPropertyArg(args);
> +			Connection conn = ij.startJBMS();
> +            Statement stmt = conn.createStatement();
> +            stmt.execute("CREATE FUNCTION testDiskHashtable() returns INTEGER EXTERNAL NAME 'org.apache.derbyTesting.functionTests.tests.store.TestDiskHashtable.runTests' LANGUAGE JAVA PARAMETER STYLE JAVA");
> +            ResultSet rs = stmt.executeQuery( "values( testDiskHashtable())");
> +            if( rs.next())
> +                failed = rs.getInt(1);
> +            stmt.close();
> +            conn.close();
> +        }
> +        catch( SQLException e)
> +        {
> +			TestUtil.dumpSQLExceptions( e);
> +            failed = 1;
> +        }
> +        catch( Throwable t)
> +        {
> +			REPORT("FAIL -- unexpected exception:" + t.toString());
> +            failed = 1;
> +		}
> +        REPORT( (failed == 0) ? "OK" : "FAILED");
> +        System.exit( (failed == 0) ? 0 : 1);
> +    }
> +
> +    private void REPORT_FAILURE(String msg)
> +    {
> +        failed = 1;
> +        REPORT( msg);
> +    }
> +    
> +    private static void REPORT(String msg)
> +    {
> +        System.out.println( msg);
> +    }
> +    
> +    public static int runTests() throws SQLException
> +    {
> +        TestDiskHashtable tester = new TestDiskHashtable();
> +        return tester.doIt();
> +    }
> +
> +    private TestDiskHashtable() throws SQLException
> +    {
> +        LanguageConnectionContext lcc = ConnectionUtil.getCurrentLCC();
> +        if( lcc == null)
> +            throw new SQLException( "Cannot get the LCC");
> +        tc = lcc.getTransactionExecute();
> +    }
> +
> +    private int doIt() throws SQLException
> +    {
> +		try {
> +            REPORT( "Starting single key, keep duplicates test");
> +            testOneVariant( tc, false, singleKeyTemplate, singleKeyCols, singleKeyRows);
> +            REPORT( "Starting single key, remove duplicates test");
> +            testOneVariant( tc, true, singleKeyTemplate, singleKeyCols, singleKeyRows);
> +            REPORT( "Starting multiple key, keep duplicates test");
> +            testOneVariant( tc, false, multiKeyTemplate, multiKeyCols, multiKeyRows);
> +            REPORT( "Starting multiple key, remove duplicates test");
> +            testOneVariant( tc, true, multiKeyTemplate, multiKeyCols, multiKeyRows);
> +
> +			tc.commit();
> +		}
> +		catch (StandardException se)
> +		{
> +            throw PublicAPI.wrapStandardException( se);
> +        }
> +        return failed;
> +    } // end of doIt
> +
> +    private static final DataValueDescriptor[] singleKeyTemplate = { new SQLInteger(), new SQLVarchar()};
> +    private static final int[] singleKeyCols = {0};
> +    private static final DataValueDescriptor[][] singleKeyRows =
> +    {
> +        {new SQLInteger(1), new SQLVarchar("abcd")},
> +        {new SQLInteger(2), new SQLVarchar("abcd")},
> +        {new SQLInteger(3), new SQLVarchar("e")},
> +        {new SQLInteger(1), new SQLVarchar("zz")}
> +    };
> +
> +    private static final DataValueDescriptor[] multiKeyTemplate = { new SQLLongint(), new SQLVarchar(), new SQLInteger()};
> +    private static final int[] multiKeyCols = {1, 0};
> +    private static final DataValueDescriptor[][] multiKeyRows =
> +    {
> +        {new SQLLongint(1), new SQLVarchar( "aa"), multiKeyTemplate[2].getNewNull()},
> +        {new SQLLongint(2), new SQLVarchar( "aa"), new SQLInteger(1)},
> +        {new SQLLongint(2), new SQLVarchar( "aa"), new SQLInteger(2)},
> +        {new SQLLongint(2), new SQLVarchar( "b"), new SQLInteger(1)}
> +    };
> +
> +    private static final int LOTS_OF_ROWS_COUNT = 50000;
> +    
> +    private void testOneVariant( TransactionController tc,
> +                                 boolean removeDups,
> +                                 DataValueDescriptor[] template,
> +                                 int[] keyCols,
> +                                 DataValueDescriptor[][] rows)
> +        throws StandardException
> +    {
> +        DiskHashtable dht = new DiskHashtable(tc, template, keyCols, removeDups, false);
> +        boolean[] isDuplicate = new boolean[ rows.length];
> +        boolean[] found = new boolean[ rows.length];
> +        HashMap simpleHash = new HashMap( rows.length);
> +
> +        testElements( removeDups, dht, keyCols, 0, rows, simpleHash, isDuplicate, found);
> +
> +        for( int i = 0; i < rows.length; i++)
> +        {
> +            Object key = KeyHasher.buildHashKey( rows[i], keyCols);
> +            Vector al = (Vector) simpleHash.get( key);
> +            isDuplicate[i] = (al != null);
> +            if( al == null)
> +            {
> +                al = new Vector(4);
> +                simpleHash.put( key, al);
> +            }
> +            if( (!removeDups) || !isDuplicate[i])
> +                al.add( rows[i]);
> +            
> +            if( dht.put( key, rows[i]) != (removeDups ? (!isDuplicate[i]) : true))
> +                REPORT_FAILURE( "  put returned wrong value on row " + i);
> +
> +            for( int j = 0; j <= i; j++)
> +            {
> +                key = KeyHasher.buildHashKey( rows[j], keyCols);
> +                if( ! rowsEqual( dht.get( key), simpleHash.get( key)))
> +                    REPORT_FAILURE( "  get returned wrong value on key " + j);
> +            }
> +
> +            testElements( removeDups, dht, keyCols, i+1, rows, simpleHash, isDuplicate, found);
> +        }
> +        // Remove them
> +        for( int i = 0; i < rows.length; i++)
> +        {
> +            Object key = KeyHasher.buildHashKey( rows[i], keyCols);
> +            if( ! rowsEqual( dht.remove( key), simpleHash.get( key)))
> +                REPORT_FAILURE( "  remove returned wrong value on key " + i);
> +            simpleHash.remove( key);
> +            if( dht.get( key) != null)
> +                REPORT_FAILURE( "  remove did not delete key " + i);
> +        }
> +        testElements( removeDups, dht, keyCols, 0, rows, simpleHash, isDuplicate, found);
> +
> +        testLargeTable( dht, keyCols, rows[0]);
> +        dht.close();
> +    } // end of testOneVariant
> +
> +    private void testLargeTable( DiskHashtable dht,
> +                                 int[] keyCols,
> +                                 DataValueDescriptor[] aRow)
> +        throws StandardException
> +    {
> +        // Add a lot of elements
> +        // If there are two or more key columns then we vary the first two key columns, using an approximately
> +        // square matrix of integer key values. Because the hash generator is commutative, key (i,j) hashes into
> +        // the same bucket as key (j,i), which exercises the case where different keys hash into the same bucket.
> +        int key1Count = (keyCols.length > 1) ? ((int) Math.round( Math.sqrt( (double) LOTS_OF_ROWS_COUNT))) : 1;
> +        int key0Count = (LOTS_OF_ROWS_COUNT + key1Count - 1)/key1Count;
> +
> +        DataValueDescriptor[] row = new DataValueDescriptor[ aRow.length];
> +        for( int i = 0; i < row.length; i++)
> +            row[i] = aRow[i].getClone();
> +        
> +        for( int key0Idx = 0; key0Idx < key0Count; key0Idx++)
> +        {
> +            row[ keyCols[0]].setValue( key0Idx);
> +            for( int key1Idx = 0; key1Idx < key1Count; key1Idx++)
> +            {
> +                if( keyCols.length > 1)
> +                    row[ keyCols[1]].setValue( key1Idx);
> +                Object key = KeyHasher.buildHashKey( row, keyCols);
> +                if( ! dht.put( key, row))
> +                {
> +                    REPORT_FAILURE( "  put returned wrong value for key(" + key0Idx + "," + key1Idx + ")");
> +                    key0Idx = key0Count;
> +                    break;
> +                }
> +            }
> +        }
> +        for( int key0Idx = 0; key0Idx < key0Count; key0Idx++)
> +        {
> +            row[ keyCols[0]].setValue( key0Idx);
> +            for( int key1Idx = 0; key1Idx < key1Count; key1Idx++)
> +            {
> +                if( keyCols.length > 1)
> +                    row[ keyCols[1]].setValue( key1Idx);
> +                Object key = KeyHasher.buildHashKey( row, keyCols);
> +                if( ! rowsEqual( dht.get( key), row))
> +                {
> +                    REPORT_FAILURE( "  large table get returned wrong value for key(" + key0Idx + "," + key1Idx + ")");
> +                    key0Idx = key0Count;
> +                    break;
> +                }
> +            }
> +        }
> +        BitSet found = new BitSet(key0Count * key1Count);
> +        Enumeration elements = dht.elements();
> +        while( elements.hasMoreElements())
> +        {
> +            Object el = elements.nextElement();
> +            if( ! (el instanceof DataValueDescriptor[]))
> +            {
> +                REPORT_FAILURE( "  large table enumeration returned wrong element type");
> +                break;
> +            }
> +            DataValueDescriptor[] fetchedRow = (DataValueDescriptor[]) el;
> +            
> +            int i = fetchedRow[ keyCols[0]].getInt() * key1Count;
> +            if( keyCols.length > 1)
> +                i += fetchedRow[ keyCols[1]].getInt();
> +            if( i >= key0Count * key1Count)
> +            {
> +                REPORT_FAILURE( "  large table enumeration returned invalid element");
> +                break;
> +            }
> +                
> +            if( found.get(i))
> +            {
> +                REPORT_FAILURE( "  large table enumeration returned same element twice");
> +                break;
> +            }
> +            found.set(i);
> +        }
> +        for( int i = key0Count * key1Count - 1; i >= 0; i--)
> +        {
> +            if( !found.get(i))
> +            {
> +                REPORT_FAILURE( "  large table enumeration missed at least one element");
> +                break;
> +            }
> +        }
> +    } // end of testLargeTable
> +
> +    private void testElements( boolean removeDups,
> +                               DiskHashtable dht,
> +                               int[] keyCols,
> +                               int rowCount,
> +                               DataValueDescriptor[][] rows,
> +                               HashMap simpleHash,
> +                               boolean[] isDuplicate,
> +                               boolean[] found)
> +        throws StandardException
> +    {
> +        for( int i = 0; i < rowCount; i++)
> +            found[i] = false;
> +        
> +        for( Enumeration e = dht.elements(); e.hasMoreElements();)
> +        {
> +            Object el = e.nextElement();
> +            if( el == null)
> +            {
> +                REPORT_FAILURE( "  table enumeration returned a null element");
> +                return;
> +            }
> +            if( el instanceof DataValueDescriptor[])
> +                checkElement( (DataValueDescriptor[]) el, rowCount, rows, found);
> +            else if( el instanceof Vector)
> +            {
> +                Vector v = (Vector) el;
> +                for( int i = 0; i < v.size(); i++)
> +                    checkElement( (DataValueDescriptor[]) v.get(i), rowCount, rows, found);
> +            }
> +            else
> +            {
> +                REPORT_FAILURE( "  table enumeration returned an incorrect element type");
> +                return;
> +            }
> +        }
> +        for( int i = 0; i < rowCount; i++)
> +        {
> +            if( (removeDups && isDuplicate[i]))
> +            {
> +                if( found[i])
> +                {
> +                    REPORT_FAILURE( "  table enumeration did not remove duplicates");
> +                    return;
> +                }
> +            }
> +            else if( ! found[i])
> +            {
> +                REPORT_FAILURE( "  table enumeration missed at least one element");
> +                return;
> +            }
> +        }
> +    } // end of testElements
> +
> +    private void checkElement( DataValueDescriptor[] fetchedRow,
> +                               int rowCount,
> +                               DataValueDescriptor[][] rows,
> +                               boolean[] found)
> +        throws StandardException
> +    {
> +        for( int i = 0; i < rowCount; i++)
> +        {
> +            if( rowsEqual( fetchedRow, rows[i]))
> +            {
> +                if( found[i])
> +                {
> +                    REPORT_FAILURE( "  table enumeration returned the same element twice");
> +                    return;
> +                }
> +                found[i] = true;
> +                return;
> +            }
> +        }
> +        REPORT_FAILURE( "  table enumeration returned an incorrect element");
> +    } // end of checkElement
> +
> +    private boolean rowsEqual( Object r1, Object r2)
> +        throws StandardException
> +    {
> +        if( r1 == null)
> +            return r2 == null;
> +
> +        if( r1 instanceof DataValueDescriptor[])
> +        {
> +            DataValueDescriptor[] row1 = (DataValueDescriptor[]) r1;
> +            DataValueDescriptor[] row2;
> +            
> +            if( r2 instanceof Vector)
> +            {
> +                Vector v2 = (Vector) r2;
> +                if( v2.size() != 1)
> +                    return false;
> +                row2 = (DataValueDescriptor[]) v2.elementAt(0);
> +            }
> +            else if( r2 instanceof DataValueDescriptor[])
> +                row2 = (DataValueDescriptor[]) r2;
> +            else
> +                return false;
> +            
> +            if( row1.length != row2.length)
> +                return false;
> +            for( int i = 0; i < row1.length; i++)
> +            {
> +                if( ! row1[i].compare( Orderable.ORDER_OP_EQUALS, row2[i], true, true))
> +                    return false;
> +            }
> +            return true;
> +        }
> +        if( r1 instanceof Vector)
> +        {
> +            if( !(r2 instanceof Vector))
> +                return false;
> +            Vector v1 = (Vector) r1;
> +            Vector v2 = (Vector) r2;
> +            if( v1.size() != v2.size())
> +                return false;
> +            for( int i = v1.size() - 1; i >= 0; i--)
> +            {
> +                if( ! rowsEqual( v1.elementAt( i), v2.elementAt(i)))
> +                    return false;
> +            }
> +            return true;
> +        }
> +        // What is it then?
> +        return r1.equals( r2);
> +    } // end of rowsEqual
> +}
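
The rowsEqual() method above is written to accept either a single
DataValueDescriptor[] or a java.util.Vector of rows from DiskHashtable.get(), with
null meaning the key is absent. A minimal sketch of a caller handling all three
cases, assuming the same imports as the test class (the helper name
countRowsForKey is purely illustrative and not part of the patch):

    private static int countRowsForKey( DiskHashtable dht, Object key)
        throws StandardException
    {
        Object val = dht.get( key);
        if( val == null)
            return 0;                           // no row stored under this key
        if( val instanceof DataValueDescriptor[])
            return 1;                           // a single row
        return ((Vector) val).size();           // duplicates kept: Vector of DataValueDescriptor[]
    }
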
> Index: java/testing/org/apache/derbyTesting/functionTests/master/TestDiskHashtable.out
> ===================================================================
> --- java/testing/org/apache/derbyTesting/functionTests/master/TestDiskHashtable.out	(revision 0)
> +++ java/testing/org/apache/derbyTesting/functionTests/master/TestDiskHashtable.out	(revision 0)
> @@ -0,0 +1,6 @@
> +Test DiskHashtable starting
> +Starting single key, keep duplicates test
> +Starting single key, remove duplicates test
> +Starting multiple key, keep duplicates test
> +Starting multiple key, remove duplicates test
> +OK
> Index: java/testing/org/apache/derbyTesting/functionTests/master/SpillHash.out
> ===================================================================
> --- java/testing/org/apache/derbyTesting/functionTests/master/SpillHash.out	(revision 0)
> +++ java/testing/org/apache/derbyTesting/functionTests/master/SpillHash.out	(revision 0)
> @@ -0,0 +1,10 @@
> +Running join
> +Running distinct
> +Running outer join 1
> +Running scroll insensitive cursor
> +Growing database.
> +Running join
> +Running distinct
> +Running outer join 1
> +Running scroll insensitive cursor
> +PASSED.
> Index: java/testing/org/apache/derbyTesting/functionTests/suites/derbylang.runall
> ===================================================================
> --- java/testing/org/apache/derbyTesting/functionTests/suites/derbylang.runall	(revision 155029)
> +++ java/testing/org/apache/derbyTesting/functionTests/suites/derbylang.runall	(working copy)
> @@ -107,6 +107,7 @@
>  lang/select.sql
>  lang/simpleThreadWrapper.java
>  lang/specjPlans.sql
> +lang/SpillHash.java
>  lang/staleplans.sql
>  lang/stmtCache0.sql
>  lang/stmtCache1.sql
> Index: build.xml
> ===================================================================
> --- build.xml	(revision 155029)
> +++ build.xml	(working copy)
> @@ -413,6 +413,7 @@
>        <arg value="java.math.BigDecimal"/>
>        <arg value="java.util.ArrayList"/>
>        <arg value="java.util.GregorianCalendar"/>
> +      <arg value="java.util.Vector"/>
>      </java>
>  
>      <javac