Posted to derby-dev@db.apache.org by "Kristian Waagan (JIRA)" <ji...@apache.org> on 2008/08/06 16:20:44 UTC

[jira] Created: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

StoreStreamClob.getReader(charPos) performs poorly
--------------------------------------------------

                 Key: DERBY-3825
                 URL: https://issues.apache.org/jira/browse/DERBY-3825
             Project: Derby
          Issue Type: Bug
          Components: JDBC, Store
    Affects Versions: 10.5.0.0
            Reporter: Kristian Waagan


StoreStreamClob.getReader(charPos) performs poorly because it resets the underlying stream and skips data until it reaches the requested character position. Not only does the data have to be skipped, it also has to be decoded (UTF-8).
The problem is exposed through EmbedClob.getSubString, which causes extremely bad performance for the client driver because the locator-based Clob implementation uses this method.

For the record, there is another read buffer size issue that exacerbates the problem (it will probably be handled under DERBY-3769, and also DERBY-3818).
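
To make the cost described above concrete, here is a minimal sketch (not Derby code) that reads a value in consecutive chunks, opening a fresh reader and skipping to the start of each chunk every time. The class name ResetAndSkipCost and both method names are hypothetical, and a plain InputStreamReader over standard UTF-8 stands in for Derby's own reader. The point is only that the number of characters decoded and thrown away grows roughly quadratically with the value length.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of reset-and-skip positioning; not Derby code. */
class ResetAndSkipCost {

    /**
     * Reads up to length chars starting at the 1-based charPos by opening a
     * fresh reader and skipping (and thereby decoding) everything before it.
     * Returns the number of chars that were skipped.
     */
    static long substringWithReset(byte[] utf8, long charPos, int length,
                                   StringBuilder out) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            long toSkip = charPos - 1;
            long skipped = 0;
            while (skipped < toSkip) {
                long n = reader.skip(toSkip - skipped);
                if (n <= 0) {
                    break; // end of stream reached before charPos
                }
                skipped += n;
            }
            char[] buf = new char[length];
            int read = 0;
            while (read < length) {
                int n = reader.read(buf, read, length - read);
                if (n < 0) {
                    break; // end of stream
                }
                read += n;
            }
            out.append(buf, 0, read);
            return skipped;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Total chars skipped when the whole value is read chunk by chunk. */
    static long totalSkipped(byte[] utf8, int totalChars, int chunk) {
        long total = 0;
        StringBuilder sink = new StringBuilder();
        for (long pos = 1; pos <= totalChars; pos += chunk) {
            total += substringWithReset(utf8, pos, chunk, sink);
        }
        return total;
    }
}
```

With 1000 characters read in chunks of 100, the sketch skips 0 + 100 + ... + 900 characters in total, so far more characters are decoded for positioning than are actually returned.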

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan reassigned DERBY-3825:
--------------------------------------

    Assignee: Kristian Waagan



[jira] Updated: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3825:
-----------------------------------

    Derby Info: [Patch Available]



[jira] Resolved: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan resolved DERBY-3825.
------------------------------------

       Resolution: Fixed
    Fix Version/s: 10.4.2.1

Backported patches 1a, 2b and 3a to 10.4 with revision 710070.
Will close when tinderbox test run has completed.



[jira] Updated: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3825:
-----------------------------------

    Attachment: derby-3825-2a-internalReader_repositioning.stat
                derby-3825-2a-internalReader_repositioning.diff

Patch 2a introduces an internal reader and adds repositioning logic to UTF8Reader.

Note that I have made getInternalReader merely forward to getReader in TemporaryClob.
I will commit the simple Clob regression tests and run them with and without this patch to document the effect.

Regression tests passed (JDK 1.6.0, Solaris 10).
Patch ready for review.



[jira] Commented: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637523#action_12637523 ] 

Kristian Waagan commented on DERBY-3825:
----------------------------------------

Just noticed I forgot the license header in the new test. I'll add it in the next rev.



[jira] Updated: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3825:
-----------------------------------


Committed patch 3a to trunk with revision 710033.



[jira] Commented: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639423#action_12639423 ] 

Knut Anders Hatlen commented on DERBY-3825:
-------------------------------------------

Thanks for the updated patch, Kristian. I think it looks ready for
commit.

> Thanks for looking at the repositioning logic. It is a bit complex,
> but it should be pretty well tested functionally. Can it be
> optimized?

I don't see how it can be optimized without changing the format, since
random access to a lob is currently not supported by the store.

It may perhaps be easier to read it if the call to resetUTF8Reader()
is moved to the beginning of the method. Then there will be just two
cases to consider for the repositioning: the requested position is
either after the current position or in the buffer. This means that we
don't need the nested if statements. Something along these lines:

if (requestedCharPos <= readerCharCount - charactersInBuffer) {
    resetUTF8Reader();
}

long currentCharPos =
    readerCharCount - charactersInBuffer + readPositionInBuffer;

long difference = (requestedCharPos - 1) - currentCharPos;

if (difference <= 0) {
    // move back in the buffer
    readPositionInBuffer += difference;
} else {
    // skip forward
    persistentSkip(difference);
}
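
The fragment above relies on fields and methods that only exist inside Derby's reader. As a rough, self-contained sketch of the same control flow, the toy class below models the buffer over a plain String so that it can be run. Everything apart from the three variable names taken from the fragment is an assumption made for illustration: the class name RepositionSketch, the resets counter, resetReader() (standing in for resetUTF8Reader()) and skipForward() (standing in for persistentSkip()).

```java
/** Toy model of a buffered character reader; hypothetical, not Derby's UTF8Reader. */
class RepositionSketch {
    private final String source;      // stands in for the decoded stream
    private final char[] buffer;
    private long readerCharCount;     // chars pulled from the source so far
    private int charactersInBuffer;   // valid chars currently in the buffer
    private int readPositionInBuffer; // index of the next char to hand out
    long resets;                      // how often the stream was reset

    RepositionSketch(String source, int bufferSize) {
        this.source = source;
        this.buffer = new char[bufferSize];
    }

    private void resetReader() {
        readerCharCount = 0;
        charactersInBuffer = 0;
        readPositionInBuffer = 0;
        resets++;
    }

    private boolean fillBuffer() {
        int start = (int) readerCharCount;
        int n = Math.min(buffer.length, source.length() - start);
        if (n <= 0) {
            return false; // end of stream
        }
        source.getChars(start, start + n, buffer, 0);
        readerCharCount += n;
        charactersInBuffer = n;
        readPositionInBuffer = 0;
        return true;
    }

    /** Returns the next char, or -1 at end of stream. */
    int read() {
        if (readPositionInBuffer >= charactersInBuffer && !fillBuffer()) {
            return -1;
        }
        return buffer[readPositionInBuffer++];
    }

    private void skipForward(long n) {
        while (n > 0) {
            int available = charactersInBuffer - readPositionInBuffer;
            if (available == 0) {
                if (!fillBuffer()) {
                    throw new IllegalArgumentException("position past end of stream");
                }
                continue;
            }
            int step = (int) Math.min(n, available);
            readPositionInBuffer += step;
            n -= step;
        }
    }

    /** Positions the reader so that the next read() returns the 1-based char. */
    void reposition(long requestedCharPos) {
        if (requestedCharPos < 1) {
            throw new IllegalArgumentException("positions are 1-based");
        }
        // Reset first if the target lies before the buffered window.
        if (requestedCharPos <= readerCharCount - charactersInBuffer) {
            resetReader();
        }
        long currentCharPos =
            readerCharCount - charactersInBuffer + readPositionInBuffer;
        long difference = (requestedCharPos - 1) - currentCharPos;
        if (difference <= 0) {
            readPositionInBuffer += difference; // move back in the buffer
        } else {
            skipForward(difference);            // skip forward
        }
    }
}
```

For example, with the source "abcdefghij" and a buffer of four chars, reposition(6) followed by read() yields 'f', a later reposition(5) steps back inside the buffer without a reset, and reposition(2) triggers exactly one reset.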



[jira] Updated: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3825:
-----------------------------------

    Attachment: derby-3825-0a-preview.diff

'derby-3825-0a-preview.diff' is a preview patch. It is incomplete and not for commit.

It introduces a new method for InternalClob: getInternalReader(charPos).
The idea is to keep only one such reader per clob that can be used internally - that is not published to the user. The most prominent example is Clob.getSubString().
There are two performance gains:
 1) Repositioning capabilities (see below).
 2) Less object creation (GC).

The repositioning functionality is added to UTF8Reader, and can be split into three types, ordered by increasing cost:
 a) Reposition within current character buffer (small hops forwards and potentially backwards - in range 1 char to 8K chars)
 b) Forward stream from current position (hops forwards)
 c) Reset stream and skip data (hops backwards)
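
As a rough illustration of the three cases (a) to (c), the sketch below classifies a requested position against the window currently held in the character buffer and estimates how many characters would have to be decoded and discarded to get there. It is not taken from the patch: the class name RepositionKindSketch, the enum, and the parameters bufferStart and charsInBuffer are hypothetical names chosen for this example.

```java
/** Hypothetical classifier for the three repositioning cases; not Derby code. */
class RepositionKindSketch {

    enum Kind { WITHIN_BUFFER, FORWARD_SKIP, RESET_AND_SKIP }

    /**
     * @param requestedCharPos 1-based character position requested
     * @param bufferStart      0-based index of the first char in the buffer
     * @param charsInBuffer    number of valid chars in the buffer
     */
    static Kind classify(long requestedCharPos, long bufferStart,
                         int charsInBuffer) {
        long target = requestedCharPos - 1;
        if (target < bufferStart) {
            return Kind.RESET_AND_SKIP;   // (c) target lies behind the buffer
        }
        if (target < bufferStart + charsInBuffer) {
            return Kind.WITHIN_BUFFER;    // (a) target is already buffered
        }
        return Kind.FORWARD_SKIP;         // (b) target lies ahead of the buffer
    }

    /** Chars that must be decoded and discarded before the target is reached. */
    static long charsToDecode(long requestedCharPos, long bufferStart,
                              int charsInBuffer) {
        long target = requestedCharPos - 1;
        switch (classify(requestedCharPos, bufferStart, charsInBuffer)) {
            case WITHIN_BUFFER:
                return 0;
            case FORWARD_SKIP:
                return target - (bufferStart + charsInBuffer);
            default:
                return target; // after a reset, everything before the target
        }
    }
}
```

Assuming an 8K-char buffer holding chars 8192 to 16383, a request for position 8200 stays within the buffer, position 20000 needs a forward skip of 3615 chars, and position 100 forces a reset followed by decoding 99 chars.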

The more I work with this, the more I feel the functionality should be pushed closer to store.

Preview patch ready for comments.



[jira] Updated: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3825:
-----------------------------------

    Attachment: derby-3825-3a-simplification.diff

'derby-3825-3a-simplification.diff' makes the repositioning logic easier to read.
Thanks for the suggestion Knut Anders.

Running tests, will commit and backport to 10.4 when the tests have finished.



[jira] Updated: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3825:
-----------------------------------

    Attachment: derby-3825-1a-reset_readpositioninbuffer.diff

'derby-3825-1a-reset_readpositioninbuffer.diff' is a small cleanup patch moving the resetting of the readPositionInBuffer variable into fillBuffer.

Committed to trunk with revision 689803.



[jira] Commented: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Ole Solberg (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644928#action_12644928 ] 

Ole Solberg commented on DERBY-3825:
------------------------------------

10.4 Tinderbox build 
http://dbtg.thresher.com/derby/test/tinderbox_10.4_16/UpdateInfo/710076-buildDetails.txt
reports:


pptesting:

compile:
    [mkdir] Created dir: Apache/TinderBox-10.4/10.4/classes.pptesting
    [javac] Compiling 11 source files to Apache/TinderBox-10.4/10.4/classes.pptesting
    [javac] Apache/TinderBox-10.4/10.4/java/testing/org/apache/derby/impl/jdbc/UTF8ReaderTest.java:57: cannot find symbol
    [javac] symbol  : method setAutoCommit(boolean)
    [javac] location: class org.apache.derby.impl.jdbc.UTF8ReaderTest
    [javac]         setAutoCommit(false);
    [javac]         ^
    [javac] Apache/TinderBox-10.4/10.4/java/testing/org/apache/derby/impl/jdbc/UTF8ReaderTest.java:88: cannot find symbol
    [javac] symbol  : method setAutoCommit(boolean)
    [javac] location: class org.apache.derby.impl.jdbc.UTF8ReaderTest
    [javac]         setAutoCommit(false);
    [javac]         ^
    [javac] Apache/TinderBox-10.4/10.4/java/testing/org/apache/derby/impl/jdbc/UTF8ReaderTest.java:126: cannot find symbol
    [javac] symbol  : method setAutoCommit(boolean)
    [javac] location: class org.apache.derby.impl.jdbc.UTF8ReaderTest
    [javac]         setAutoCommit(false);
    [javac]         ^
    [javac] Note: Apache/TinderBox-10.4/10.4/java/testing/org/apache/derby/impl/drda/TestProto.java uses unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 3 errors

BUILD FAILED
Apache/TinderBox-10.4/10.4/build.xml:457: The following error occurred while executing this line:
Apache/TinderBox-10.4/10.4/java/testing/org/apache/derby/build.xml:44: Compile failed; see the compiler error output for details.


Ditto on 10.4 branch build last night: 



[jira] Closed: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan closed DERBY-3825.
----------------------------------


Closing the issue.

Note that getReader is still rather slow compared to what it could have been. Again the cause is positioning. The trick I used for getInternalReader cannot be used, because the reader from getReader is passed out to the user and cannot be shared.
Given the current requirements and limitations, a possible optimization is to remember one or more byte/char positions and then be able to skip bytes instead of chars. The latter requires decoding, the former doesn't.

On the other hand, if a user just obtains one reader and keeps reading from it, the initial positioning cost doesn't matter that much.
The problem with getSubString has been resolved by the patches committed.
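
The possible optimization mentioned above, remembering a byte/char position pair, could look roughly like the sketch below. It is an illustration under stated assumptions and not part of any committed patch: the class name ByteCheckpointSketch and all method names are hypothetical, the data is a byte array in standard UTF-8 rather than Derby's stored stream format, and only characters outside the surrogate range are considered. The checkpoint lets the raw stream skip bytes without decoding them, so only the characters between the checkpoint and the requested position are decoded.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical byte/char checkpoint positioning; not Derby code. */
class ByteCheckpointSketch {

    /** Number of UTF-8 bytes encoding the first charCount chars of text. */
    static long byteOffsetOf(String text, int charCount) {
        return text.substring(0, charCount)
                   .getBytes(StandardCharsets.UTF_8).length;
    }

    /**
     * Opens a reader positioned at the 1-based requestedCharPos. The pair
     * (checkpointChars, checkpointBytes) says that the first checkpointChars
     * chars occupy exactly checkpointBytes bytes.
     */
    static Reader openAt(byte[] utf8, long checkpointChars,
                         long checkpointBytes, long requestedCharPos)
            throws IOException {
        if (requestedCharPos - 1 < checkpointChars) {
            throw new IllegalArgumentException("target is before checkpoint");
        }
        InputStream in = new ByteArrayInputStream(utf8);
        long skippedBytes = 0;
        while (skippedBytes < checkpointBytes) {   // cheap: no decoding
            long n = in.skip(checkpointBytes - skippedBytes);
            if (n <= 0) {
                throw new EOFException("checkpoint is past end of data");
            }
            skippedBytes += n;
        }
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        long remaining = (requestedCharPos - 1) - checkpointChars;
        while (remaining > 0) {                    // costly: decodes chars
            long n = reader.skip(remaining);
            if (n <= 0) {
                throw new EOFException("position is past end of data");
            }
            remaining -= n;
        }
        return reader;
    }

    /** Convenience wrapper: the char at requestedCharPos, or -1 at the end. */
    static int charAt(byte[] utf8, long checkpointChars,
                      long checkpointBytes, long requestedCharPos) {
        try (Reader r = openAt(utf8, checkpointChars, checkpointBytes,
                               requestedCharPos)) {
            return r.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A checkpoint of (0 chars, 0 bytes) degenerates to the plain skip-by-chars behaviour, so the gain depends entirely on how close the remembered position is to the requested one.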



[jira] Updated: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3825:
-----------------------------------

       Derby Info:   (was: [Patch Available])
    Fix Version/s: 10.5.0.0

Thanks again, Knut Anders.

I committed patch 2b to trunk with revision 704547, and will look at the suggested simplification(s) in a separate patch.



[jira] Commented: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644941#action_12644941 ] 

Kristian Waagan commented on DERBY-3825:
----------------------------------------

Thanks for letting me know, Ole.

I'm not sure how this slipped through, guess I have to look at my test-deployment scripts...
Committed fix to 10.4 with revision 711221.

I'm sorry for the noise.



[jira] Updated: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3825:
-----------------------------------

    Attachment: derby-3825-2b-internalReader_repositioning.diff

Patch 2b addresses the following:
 - added license to UTF8ReaderTest
 - changed class comment in UTF8ReaderTest
 - added DEBUG block with detailed EOFException

Regarding Knut Anders' comments (from top to bottom):
 - Thanks for looking at the repositioning logic. It is a bit complex, but it should be pretty well tested functionally. Can it be optimized?
 - Regarding the poor performance you observed, I need to look into it, and I will create a new Jira for it. Implementing getInternalReader is one issue, but I think there are more severe problems in this code path (it is using some special reader objects). Hopefully ClobAccessTest will demonstrate the problem.
 - I resolved the localization issue by throwing a detailed EOFException from inside a debug block. I chose this over an assert because an assert would make a test fail (the test expects an EOFException).
 - I changed the comment, as it was not very precise. I need a package-private test because of StoreStreamClob, which is the only user of UTF8Reader.reposition at the moment, and I wanted a little more control than I get by going through the JDBC API.
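
As an illustration of the debug-block approach mentioned above, here is a minimal sketch. It is not the code from the patch: the class SkipHelper, the method skipFully, and the DEBUG constant are hypothetical stand-ins (in Derby the flag would presumably come from the sanity/debug build machinery).

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical helper for illustration only; not Derby's UTF8Reader. */
final class SkipHelper {
    /** Hypothetical stand-in for a build-time debug flag. */
    static final boolean DEBUG = true;

    /** Skips exactly toSkip chars, or throws EOFException if the stream ends first. */
    static void skipFully(Reader in, long toSkip) throws IOException {
        long remaining = toSkip;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                // skip() made no progress and read() confirms end of stream.
                if (DEBUG) {
                    // Detailed message only in debug builds.
                    throw new EOFException("Reached end of stream after "
                            + (toSkip - remaining) + " of " + toSkip + " chars");
                }
                // Production builds throw without a message.
                throw new EOFException();
            } else {
                // read() consumed one char, so count it as skipped.
                remaining--;
            }
        }
    }
}
```

Either way the caller sees an EOFException, so a test that expects that exception type passes in both debug and non-debug builds.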

Thanks for looking at the patch!
Unless there are more comments on patch 2b, I expect to commit it shortly.

> StoreStreamClob.getReader(charPos) performs poorly
> --------------------------------------------------
>
>                 Key: DERBY-3825
>                 URL: https://issues.apache.org/jira/browse/DERBY-3825
>             Project: Derby
>          Issue Type: Bug
>          Components: JDBC, Store
>    Affects Versions: 10.5.0.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>         Attachments: derby-3825-0a-preview.diff, derby-3825-1a-reset_readpositioninbuffer.diff, derby-3825-2a-internalReader_repositioning.diff, derby-3825-2a-internalReader_repositioning.stat, derby-3825-2b-internalReader_repositioning.diff
>
>
> StoreStreamClob.getReader(charPos) performs poorly because it resets the underlying stream and skips data until it reaches the requested character position. Not only does the data have to be skipped, it also has to be decoded (UTF-8).
> The problem is exposed through EmbedClob.getSubString, which causes extremely bad performance for the client driver because the locator-based Clob implementation uses this method.
> For the record, there is another read buffer size issue that aggravates the problem (it will probably be handled under DERBY-3769, and also DERBY-3818).



[jira] Commented: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639064#action_12639064 ] 

Knut Anders Hatlen commented on DERBY-3825:
-------------------------------------------

I read through the patch, and it looks correct to me. The repositioning logic was rather complex, mainly because of the buffering and the mix of 1-based indexes on the JDBC level and 0-based indexes on the buffer level, but it looks to me as if all cases are handled correctly.
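
To make the index handling concrete, here is a simplified sketch of repositioning in a buffered reader. It is not the code from the patch: RepositionableReader, Source, and all field names are hypothetical, and the UTF-8 decoding step is left out. Positions passed in are 1-based (as in JDBC), while the buffer index is 0-based.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical sketch for illustration only; not Derby's UTF8Reader. */
final class RepositionableReader {
    /** Hypothetical factory that reopens the underlying data from the start. */
    interface Source { Reader open() throws IOException; }

    private final Source source;
    private final char[] buf;
    private Reader in;
    private int bufLen = 0;     // number of valid chars in buf
    private int bufIdx = 0;     // 0-based index of the next char in buf
    private long bufStart = 1;  // 1-based position of buf[0]
    long reopenCount = 0;       // how often we had to start over

    RepositionableReader(Source source, int bufSize) throws IOException {
        this.source = source;
        this.buf = new char[bufSize];
        this.in = source.open();
    }

    /** Moves to the given 1-based character position. */
    void reposition(long pos) throws IOException {
        if (pos < 1) throw new IllegalArgumentException("pos must be >= 1");
        if (pos >= bufStart && pos < bufStart + bufLen) {
            bufIdx = (int) (pos - bufStart);  // case 1: inside the buffer
            return;
        }
        if (pos < bufStart) {                 // case 2: behind us, start over
            in.close();
            in = source.open();
            reopenCount++;
            bufStart = 1;
            bufLen = 0;
        }
        // case 3 (and the rest of case 2): skip forward from the stream position.
        long toSkip = pos - (bufStart + bufLen);
        while (toSkip > 0) {
            long n = in.skip(toSkip);
            // Simplification: no progress is treated as end of stream.
            if (n <= 0) throw new EOFException();
            toSkip -= n;
        }
        bufStart = pos;
        bufLen = 0;
        bufIdx = 0;
    }

    /** Returns the next char, or -1 at end of stream. */
    int read() throws IOException {
        if (bufIdx >= bufLen) {
            bufStart += bufLen;  // the old buffer content is consumed
            bufLen = 0;
            bufIdx = 0;
            int n = in.read(buf, 0, buf.length);
            if (n <= 0) return -1;
            bufLen = n;
        }
        return buf[bufIdx++];
    }
}
```

Only a backward move outside the buffer forces a restart from the beginning; moves within the buffer or forward reuse the current stream position.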

I have also tested the patch by fetching a 32 MB CLOB with getSubString(). Without the patch, it took so long that I eventually hit Ctrl-C (after waiting for minutes). With the patch, it took 3-4 seconds to fetch the CLOB, so the patch seems to work very well! This was with a CLOB stored in a table. When I tested the same with a CLOB created by Connection.createClob(), I still observed that getSubString() took a very long time. I suppose this is what's meant by the TODO comment in TemporaryClob?
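
A rough cost model may help explain the size of the difference. The sketch below only counts decoded characters; it does not touch Derby. The class name and the chunk size used in the note after it are assumptions, since the chunk size actually used is not stated here.

```java
/** Hypothetical cost model for illustration only; it counts decoded chars. */
final class ChunkedFetchCost {
    /**
     * Chars decoded when every chunk request resets the stream and skips
     * from the start to the requested position before reading the chunk.
     */
    static long withReset(long totalChars, long chunkChars) {
        long decoded = 0;
        for (long start = 0; start < totalChars; start += chunkChars) {
            long len = Math.min(chunkChars, totalChars - start);
            decoded += start + len;  // skip `start` chars, then read `len`
        }
        return decoded;
    }

    /** Chars decoded when the reader keeps its position between requests. */
    static long withForwardReposition(long totalChars) {
        return totalChars;
    }
}
```

Assuming single-byte characters and a hypothetical chunk size of 32 Ki characters, a 32 Mi character CLOB costs about 17 billion decoded characters with reset-and-skip, against about 33.5 million when the position is kept. The reset-and-skip cost grows quadratically with the CLOB size.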

I noticed a TODO that suggested that localization was going to be added in UTF8Reader.skipPersistent(). Is this planned for a follow-up patch? Elsewhere in the code, we just throw an EOFException with no message in similar situations. Perhaps the detailed error message could be put in a THROWASSERT() so that it is only used in debug builds?

The class javadoc for UTF8ReaderTest says that it tests "package-private methods in {@code UTF8Reader}." As far as I can see, it only tests the public methods of UTF8Reader.

> StoreStreamClob.getReader(charPos) performs poorly
> --------------------------------------------------
>
>                 Key: DERBY-3825
>                 URL: https://issues.apache.org/jira/browse/DERBY-3825
>             Project: Derby
>          Issue Type: Bug
>          Components: JDBC, Store
>    Affects Versions: 10.5.0.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>         Attachments: derby-3825-0a-preview.diff, derby-3825-1a-reset_readpositioninbuffer.diff, derby-3825-2a-internalReader_repositioning.diff, derby-3825-2a-internalReader_repositioning.stat
>
>
> StoreStreamClob.getReader(charPos) performs poorly because it resets the underlying stream and skips data until it reaches the requested character position. Not only does the data have to be skipped, it also has to be decoded (UTF-8).
> The problem is exposed through EmbedClob.getSubString, which causes extremely bad performance for the client driver because the locator-based Clob implementation uses this method.
> For the record, there is another read buffer size issue that aggravates the problem (it will probably be handled under DERBY-3769, and also DERBY-3818).



[jira] Updated: (DERBY-3825) StoreStreamClob.getReader(charPos) performs poorly

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3825:
-----------------------------------

    Derby Info:   (was: [Patch Available])

> StoreStreamClob.getReader(charPos) performs poorly
> --------------------------------------------------
>
>                 Key: DERBY-3825
>                 URL: https://issues.apache.org/jira/browse/DERBY-3825
>             Project: Derby
>          Issue Type: Bug
>          Components: JDBC, Store
>    Affects Versions: 10.5.0.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>         Attachments: derby-3825-0a-preview.diff, derby-3825-1a-reset_readpositioninbuffer.diff
>
>
> StoreStreamClob.getReader(charPos) performs poorly because it resets the underlying stream and skips data until it reaches the requested character position. Not only does the data have to be skipped, it also has to be decoded (UTF-8).
> The problem is exposed through EmbedClob.getSubString, which causes extremely bad performance for the client driver because the locator-based Clob implementation uses this method.
> For the record, there is another read buffer size issue that aggravates the problem (it will probably be handled under DERBY-3769, and also DERBY-3818).
