You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Kristian Waagan (JIRA)" <ji...@apache.org> on 2008/11/05 11:41:44 UTC
[jira] Created: (DERBY-3934) Improve performance of reading
modified Clobs
Improve performance of reading modified Clobs
---------------------------------------------
Key: DERBY-3934
URL: https://issues.apache.org/jira/browse/DERBY-3934
Project: Derby
Issue Type: Improvement
Components: JDBC
Affects Versions: 10.5.0.0
Reporter: Kristian Waagan
Assignee: Kristian Waagan
The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
- unmodified Clob (StoreStreamClob) : ~1 300 ms
- modified Clob (TemporaryClob): ~156 000 ms
In this case, the Clob was modified by changing the first character.
A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Derby Info: (was: [Patch Available])
Committed patch 2a to trunk with revision 721162.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Attachment: derby-3934-6a-UTF8Reader_remove_method.diff
Patch 6a removes the now unused method 'readUnsignedShort'. It is no longer used because the reading of the header bytes have been moved outside the reader class.
Committed to trunk with revision 742380.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff, derby-3934-3a-clobupdreader_utf8reader.diff, derby-3934-3a-clobupdreader_utf8reader.stat, derby-3934-4a-getinternalreader_cachedlength.diff, derby-3934-5a-UTF8Reader_cleanup.diff, derby-3934-6a-UTF8Reader_remove_method.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654353#action_12654353 ]
Kristian Waagan commented on DERBY-3934:
----------------------------------------
Tests ran without errors.
Committed patch 3a to trunk with revision 724294.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff, derby-3934-3a-clobupdreader_utf8reader.diff, derby-3934-3a-clobupdreader_utf8reader.stat
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Derby Info: [Patch Available]
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Attachments: derby-3934-1a-clob_replace_test.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Attachment: derby-3934-2a-intclob_new_methods.diff
'derby-3934-2a-intclob_new_methods.diff' adds two new methods to the InternalClob interface; getUpdateCount and isReleased.
They can be used to determine if the content of a Clob has been changed, and if the internal Clob representation has been changed or the Clob has been closed.
The isReleased() method will be used to detect when a read-only Clob representation is changed to a writable representation due to a user's request to modify the content (i.e. setString).
The code is currently not being used, but it will be by the new implementation of ClobUpdatableReader (not yet posted - I do have a prototype if anyone is interested).
Patch ready for review.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Derby Info: (was: [Patch Available])
Regression tests for patch 5a ran without failures.
Committed to trunk with revision 736000.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff, derby-3934-3a-clobupdreader_utf8reader.diff, derby-3934-3a-clobupdreader_utf8reader.stat, derby-3934-4a-getinternalreader_cachedlength.diff, derby-3934-5a-UTF8Reader_cleanup.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Closed: (DERBY-3934) Improve performance of reading modified
Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan closed DERBY-3934.
----------------------------------
Resolution: Fixed
Closing this issue.
Some performance results (embedded only) can be seen here: http://www.nabble.com/CLOB-performance-td21831259.html#a21831259
The most relevant result is testFetchLargeClobPieceByPieceModified, followed by testFetchLargeClobOneByOneCharModified and testFetchLargeClobsModified.
There's a small performance hit for the testFetchLargeClobsModified, this may be because we create update sensitive streams unconditionally and that UTF8Reader has been equipped with more functionality. However, the changes to UTF8Reader are required to obtain acceptable performance with the client driver, which always fetches CLOBs piece by piece (through a callable statement using Clob.getSubString).
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff, derby-3934-3a-clobupdreader_utf8reader.diff, derby-3934-3a-clobupdreader_utf8reader.stat, derby-3934-4a-getinternalreader_cachedlength.diff, derby-3934-5a-UTF8Reader_cleanup.diff, derby-3934-6a-UTF8Reader_remove_method.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Fix Version/s: 10.5.0.0
Committed patch 4a to trunk with revision 726683.
Work remaining:
o See if a byte/char position cache can be used [more efficiently] to improve repositioning performance.
o Cleaning up after the changes in UTF8Reader.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff, derby-3934-3a-clobupdreader_utf8reader.diff, derby-3934-3a-clobupdreader_utf8reader.stat, derby-3934-4a-getinternalreader_cachedlength.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Derby Info: [Patch Available]
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff, derby-3934-3a-clobupdreader_utf8reader.diff, derby-3934-3a-clobupdreader_utf8reader.stat, derby-3934-4a-getinternalreader_cachedlength.diff, derby-3934-5a-UTF8Reader_cleanup.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650900#action_12650900 ]
Kristian Waagan commented on DERBY-3934:
----------------------------------------
Committed patch 1a to trunk with revision 720767.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Attachment: derby-3934-4a-getinternalreader_cachedlength.diff
Patch 'derby-3934-4a-getinternalreader_cachedlength.diff' improves the performance of Clob.getSubString when the Clob is represented by a TemporaryClob.
The main change in the patch is the implementation of TemporaryClob.getInternalReader(long).
The second important change is that TemporaryClob now caches the character length. When the contents are changed, the length is either reset or updated accordingly if possible.
I also added some length to verify that the length is correctly updated on Clob modifications.
On my machine I see the following durations (one iteration) for ClobAccessTest.testFetchLargeClobPieceByPieceModified:
trunk ~76'000 ms
with patch 4a ~600 ms
So with the patch the test completes in less than 1% of the time used by trunk. This is saved CPU time.
Regressing tests are running.
Patch ready for review.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff, derby-3934-3a-clobupdreader_utf8reader.diff, derby-3934-3a-clobupdreader_utf8reader.stat, derby-3934-4a-getinternalreader_cachedlength.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Attachment: derby-3934-3a-clobupdreader_utf8reader.stat
derby-3934-3a-clobupdreader_utf8reader.diff
'derby-3934-3a-clobupdreader_utf8reader.diff' makes the handling of
StoreStreamClob and TemporaryClob consistent.
The following files are touched (all in derby.impl.jdbc):
*** EmbedClob.
Updated call to ClobUpdatableReader. The change of the position argument is
intentional.
*** TemporaryClob
Replaced the ClobUpdatableReader returned by getReader with a UTF8Reader.
Internal handling of TemporaryClob should deal with changing contents
specifically, or create a ClobUpdatableReader where required.
Note also the use of the new CharacterStreamDescriptor class. This piece of
code will probably be changed later on, when there is more information about
the stream available. For instance, caching byte/char positions allows to skip
directly to the byte position through the underlying file API. This way, we
don't have to decode all the raw bytes to skip the correct number of chars.
*** ClobUpdatableReader
More or less rewritten. It now uses the new methods exposed by InternalClob to
detect changes in the underlying Clob content. Note that this class doesn't
handle repositioning, only detection of changes and forwarding of read/skip
calls.
Note the lazy initialization of the underlying reader.
WARNING: There is one thing missing, which is proper synchronization. Access to
store will be synchronized in other locations, but this class is not thread
safe. I haven't decided yet whether to synchronize on the reader object or the
root connection. I think the latter is the best choice. Does anyone know
anything about the cost of taking locks on the same object multiple times?
*** StoreStreamClob
Replaced old UTF8Reader constructor with the new one. Again, this code needs
to be updated when more information about the stream is available. This is to
allow UTF8Reader to perform better.
*** UTF8Reader
Added a new constructor, using the new CharacterStreamDescriptor class.
Removed one constructor.
Retrofitted the second old constructor to use CharacterStreamDescriptor. This
will be removed when the calling code has been updated.
The old method calculating the buffer size will also be removed.
Stopped referencing PositionedStoreStream, using PositionedStream interface
instead. This allows the positioning logic to be used for both store streams
and LOBInputStream streams.
The reader has been prepared to be able to deal with multiple data offsets,
i.e. handling several store stream formats. For instance, the current
implementations has an offset of two bytes, where as the planned new one will
have an offset of at least five bytes. LOBInputStream has an offset of zero
bytes (no header information).
From now on, position aware streams are not closed as early as before, because
we might have go backwards in the stream. Streams that can only move forwards
are closed as soon as possible (as before).
Tests are running, and about 3/4 finished. No errors so far. I will post final
results later.
Patch ready for review.
The plan forwards
-----------------
After patch 3a is in, I plan to do the following;
1) Implement TemporaryClob.getInternalReader().
This will dramatically improve the Clob.getSubString performance for
modified Clobs.
2) I will consider adding a simple byte/char position cache.
The point of this is to be able to skip to a given byte position without
having to decode byte into chars. This is a mechanism that will only help
certain access patterns, but it should come with a very low overhead.
3) Continue working with the new Clob format.
When it is in place, care must be taken to utilize the new steam
information where possible. The primary one is returning the length through
Clob.length(). A second opportunity is using the length information to take
decisions in the byte/char position cache.
This work is mostly related to DERBY-3907.
I'm using the simple Clob regression tests in my work, and it has already
revealed a bug :) I had forgotten to include the know byte length in the
CharacterStreamDescription, which caused UTF8Reader to allocate a buffer that
was way too big (8K instead of 100 bytes).
The last step in my LOB work will be to write a simple report documenting the
improvements.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff, derby-3934-3a-clobupdreader_utf8reader.diff, derby-3934-3a-clobupdreader_utf8reader.stat
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Attachment: derby-3934-1a-clob_replace_test.diff
'derby-3934-1a-clob_replace_test.diff' adds two simple tests where the characters of a Clob containing multi-byte chars are replaced one by one - forwards and backwards.
This test can detect errors in the positioning logic and incorrect offsets/positions in the internal stream descriptors.
Patch ready for review.
I will commit this patch tomorrow.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Attachments: derby-3934-1a-clob_replace_test.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-3934) Improve performance of reading
modified Clobs
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-3934:
-----------------------------------
Attachment: derby-3934-5a-UTF8Reader_cleanup.diff
Patch 5a removes a constructor and a method called only from that constructor.
Updated the code in EmbedCallableStatement20 to use the new constructor.
Running regression tests, will commit tomorrow if the tests pass.
> Improve performance of reading modified Clobs
> ---------------------------------------------
>
> Key: DERBY-3934
> URL: https://issues.apache.org/jira/browse/DERBY-3934
> Project: Derby
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 10.5.0.0
> Reporter: Kristian Waagan
> Assignee: Kristian Waagan
> Fix For: 10.5.0.0
>
> Attachments: derby-3934-1a-clob_replace_test.diff, derby-3934-2a-intclob_new_methods.diff, derby-3934-3a-clobupdreader_utf8reader.diff, derby-3934-3a-clobupdreader_utf8reader.stat, derby-3934-4a-getinternalreader_cachedlength.diff, derby-3934-5a-UTF8Reader_cleanup.diff
>
>
> The performance of reading modified Clobs is poor, which is demonstrated by running a test program selecting a 10 MB Clob and then getting the contents using getSubString:
> - unmodified Clob (StoreStreamClob) : ~1 300 ms
> - modified Clob (TemporaryClob): ~156 000 ms
> In this case, the Clob was modified by changing the first character.
> A number of subtasks will be created to handle the various issues, which will be related to both performance and code cleanup.
> For a brief overview, see http://www.nabble.com/Suggestion-for-improving-ClobUpdatableReader-and-related-code-to20308303.html
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.