You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-commits@db.apache.org by mi...@apache.org on 2005/06/13 17:18:53 UTC
svn commit: r190415 -
/incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/services/io
/incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types
/incubator/derby/code/trunk/java/engine/org/apache/derby/impl/jdbc
Author: mikem
Date: Mon Jun 13 08:18:51 2005
New Revision: 190415
URL: http://svn.apache.org/viewcvs?rev=190415&view=rev
Log:
Fix for DERBY-302, committing on behalf of
DERBY-302 - Insert on Clob of 500k of data using streams takes long time.
It takes 3 mins on a sun jvm1.4.2 and 20 seconds with ibm jvm 1.4.2.
The following fix improves the performance of insert into a 500k blob from 20
seconds to around 1 second. Note that by changing just the test program
time was reduced from 3 minutes to avg around 20 seconds.
Currently in derby, for an insert on a clob using setCharacterStream what will
happen is , the entire stream will be materialized into a char array and sent
to store for the insert. ( we should not have to stream here. I will file
another jira issue for this and put in all information I learnt)
Given this is how inserts for large clobs are happening, the performance issue
analysis is as follows:
-- profiler run shows that most time is spent in SQLChar.readExternal which
is where the materialization into a char array for the user's input stream
happens. The growth of this array happens gradually till the entire stream
is materialized into the array. Below code snippet shows by how much the
array is grown each time when it realizes it has to read more bytes from the
stream.
The dvd hierarchy for clob is - SQLClob ( dvd) extends SQLVarChar
extends SQLChar.
So in SQLChar.readExternal
........
int growby = in.available();
if(growby < 64)
growby = 64
and then an allocation and an arraycopy to the new allocated array.
-- In the code snippet, 'in' is the wrapper around the user's stream which is
ReaderToUTF8Stream. ReaderToUTF8Stream extends InputStream and does not
override available() method . As per the spec, InputStream.available()
returns 0.
-- Thus each time, the array growth is by 64 bytes which is obviously not
performant. so for a 500k clob insert, this would mean allocation &
arraycopy steps happen ~8000 times.
-- The ReaderToUTF8Stream that has the user's stream reads from the stream and
does the utf8 conversion and puts it in a 4k array. I think it is reasonable
to have a 32k buffer to store this information for clobs.
Although I think there seems to be more possible optimizations in this area,
I prefer the incremental approach too :) so this patch is a first step
towards fixing the insert clob performance in the current system.
Fix includes:
-- enhanced the way the array was grown to keep the original 64 bytes for
char ( seems reasonable given the upper limit for char) but override it to
have 4k for varchar and clobs.
-- OVERRIDE AVAILABLE() IN READERTOUTF8STREAM TO RETURN A BETTER ESTIMATE OF
HOW MANY BYTES CAN BE READ.
Modified:
incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/services/io/LimitReader.java
incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLChar.java
incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLVarchar.java
incubator/derby/code/trunk/java/engine/org/apache/derby/impl/jdbc/ReaderToUTF8Stream.java
Modified: incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/services/io/LimitReader.java
URL: http://svn.apache.org/viewcvs/incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/services/io/LimitReader.java?rev=190415&r1=190414&r2=190415&view=diff
==============================================================================
--- incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/services/io/LimitReader.java (original)
+++ incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/services/io/LimitReader.java Mon Jun 13 08:18:51 2005
@@ -119,6 +119,16 @@
limitInPlace = true;
return;
}
+
+ /**
+ * return limit of the stream that can be read without throwing
+ * EOFException
+ * @return the remaining bytes left to be read from the stream
+ */
+ public final int getLimit()
+ {
+ return remainingBytes;
+ }
/**
Clear any limit set by setLimit. After this call no limit checking
Modified: incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLChar.java
URL: http://svn.apache.org/viewcvs/incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLChar.java?rev=190415&r1=190414&r2=190415&view=diff
==============================================================================
--- incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLChar.java (original)
+++ incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLChar.java Mon Jun 13 08:18:51 2005
@@ -86,6 +86,18 @@
extends DataType implements StringDataValue, StreamStorable
{
+ /**
+ * threshold, that decides when we return space back to the VM
+ * see getString() where it is used
+ */
+ protected final static int RETURN_SPACE_THRESHOLD = 4096;
+
+ /**
+ * when we know that the array needs to grow by at least
+ * one byte, it is not performant to grow by just one byte
+ * instead this amount is used to provide a reasonable growby size.
+ */
+ private final static int GROWBY_FOR_CHAR = 64;
/**
Static array that can be used for blank padding.
*/
@@ -335,7 +347,7 @@
// data is stored in the char[] array
value = new String(rawData, 0, len);
- if (len > 4096) {
+ if (len > RETURN_SPACE_THRESHOLD) {
// free up this char[] array to reduce memory usage
rawData = null;
rawLength = -1;
@@ -641,6 +653,11 @@
int utflen = in.readUnsignedShort();
int requiredLength;
+ // minimum amount that is reasonable to grow the array
+ // when we know the array needs to growby at least one
+ // byte but we dont want to grow by one byte as that
+ // is not performant
+ int minGrowBy = growBy();
if (utflen != 0)
{
// the object was not stored as a streaming column
@@ -654,8 +671,8 @@
// OR
// The original string was a 0 length string.
requiredLength = in.available();
- if (requiredLength < 64)
- requiredLength = 64;
+ if (requiredLength < minGrowBy)
+ requiredLength = minGrowBy;
}
char str[];
@@ -707,12 +724,26 @@
if (strlen >= arrayLength) // the char array needs to be grown
{
int growby = in.available();
-
- // We know at the array needs to be grown by at least one.
+ // We know that the array needs to be grown by at least one.
// However, even if the input stream wants to block on every
// byte, we don't want to grow by a byte at a time.
- if (growby < 64)
- growby = 64;
+ // Note, for large data (clob > 32k), it is performant
+ // to grow the array by atleast 4k rather than a small amount
+ // Even better maybe to grow by 32k but then may be
+ // a little excess(?) for small data.
+ // hopefully in.available() will give a fair
+ // estimate of how much data can be read to grow the
+ // array by larger and necessary chunks.
+ // This performance issue due to
+ // the slow growth of this array was noticed since inserts
+ // on clobs was taking a really long time as
+ // the array here grew previously by 64 bytes each time
+ // till stream was drained. (Derby-302)
+ // for char, growby 64 seems reasonable, but for varchar
+ // clob 4k or 32k is performant and hence
+ // growBy() is override correctly to ensure this
+ if (growby < minGrowBy)
+ growby = minGrowBy;
int newstrlength = arrayLength + growby;
char oldstr[] = str;
@@ -804,6 +835,18 @@
cKey = null;
}
+ /**
+ * returns the reasonable minimum amount by
+ * which the array can grow . See readExternal.
+ * when we know that the array needs to grow by at least
+ * one byte, it is not performant to grow by just one byte
+ * instead this amount is used to provide a resonable growby size.
+ * @return minimum reasonable growby size
+ */
+ protected int growBy()
+ {
+ return GROWBY_FOR_CHAR; //seems reasonable for a char
+ }
/**
* @see Storable#restoreToNull
*
Modified: incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLVarchar.java
URL: http://svn.apache.org/viewcvs/incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLVarchar.java?rev=190415&r1=190414&r2=190415&view=diff
==============================================================================
--- incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLVarchar.java (original)
+++ incubator/derby/code/trunk/java/engine/org/apache/derby/iapi/types/SQLVarchar.java Mon Jun 13 08:18:51 2005
@@ -182,4 +182,17 @@
{
return TypeId.VARCHAR_PRECEDENCE;
}
+
+ /**
+ * returns the reasonable minimum amount by
+ * which the array can grow . See readExternal.
+ * when we know that the array needs to grow by at least
+ * one byte, it is not performant to grow by just one byte
+ * instead this amount is used to provide a resonable growby size.
+ * @return minimum reasonable growby size
+ */
+ protected final int growBy()
+ {
+ return RETURN_SPACE_THRESHOLD; //seems reasonable for a varchar or clob
+ }
}
Modified: incubator/derby/code/trunk/java/engine/org/apache/derby/impl/jdbc/ReaderToUTF8Stream.java
URL: http://svn.apache.org/viewcvs/incubator/derby/code/trunk/java/engine/org/apache/derby/impl/jdbc/ReaderToUTF8Stream.java?rev=190415&r1=190414&r2=190415&view=diff
==============================================================================
--- incubator/derby/code/trunk/java/engine/org/apache/derby/impl/jdbc/ReaderToUTF8Stream.java (original)
+++ incubator/derby/code/trunk/java/engine/org/apache/derby/impl/jdbc/ReaderToUTF8Stream.java Mon Jun 13 08:18:51 2005
@@ -43,11 +43,14 @@
private int blen;
private boolean eof;
private boolean multipleBuffer;
+ // buffer to hold the data read from stream
+ // and converted to UTF8 format
+ private final static int BUFSIZE = 32768;
ReaderToUTF8Stream(LimitReader reader)
{
this.reader = reader;
- buffer = new byte[4096];
+ buffer = new byte[BUFSIZE];
blen = -1;
}
@@ -196,5 +199,20 @@
reader.close();
}
+ /**
+ * Return an optimized version of bytes available to read from
+ * the stream
+ * Note, it is not exactly per java.io.InputStream#available()
+ */
+ public final int available()
+ {
+ int remainingBytes = reader.getLimit();
+ // this object buffers BUFSIZE bytes that can be read
+ // and when that is finished it reads the next available bytes
+ // from the reader object
+ // reader.getLimit() returns the remaining bytes available
+ // on this stream
+ return (BUFSIZE > remainingBytes ? remainingBytes : BUFSIZE);
+ }
}