Posted to derby-dev@db.apache.org by "Kristian Waagan (JIRA)" <ji...@apache.org> on 2008/07/22 10:54:32 UTC

[jira] Created: (DERBY-3791) Excessive memory usage when fetching small Clobs

Excessive memory usage when fetching small Clobs
------------------------------------------------

                 Key: DERBY-3791
                 URL: https://issues.apache.org/jira/browse/DERBY-3791
             Project: Derby
          Issue Type: Bug
          Components: JDBC
    Affects Versions: 10.4.1.3, 10.3.1.4, 10.2.2.0, 10.5.0.0
            Reporter: Kristian Waagan
            Assignee: Kristian Waagan
            Priority: Minor


When investigating DERBY-3312 I found out that performance with the embedded driver has decreased a lot as well.
Analysis on trunk indicates excessive memory usage, causing high allocation and garbage collection costs.

I believe there was another major performance problem in 10.3, but it appears fixed in trunk. I have not spent time identifying this problem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (DERBY-3791) Excessive memory usage when fetching small Clobs

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3791:
-----------------------------------

    Derby Info: [Patch Available, Regression]  (was: [Regression])

Ticked patch available flag.
I hope to backport patch 1 to 10.4 and possibly 10.3.
Regarding StringClob, it will only go into trunk and I will create a subissue for it.



[jira] Closed: (DERBY-3791) Excessive memory usage when fetching small Clobs

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan closed DERBY-3791.
----------------------------------


Closed the issue.
There might still be a small optimization possibility for small Clobs (not stored as a stream) by not creating a TemporaryClob for them, but I don't think it is worth the hassle at the moment.



[jira] Updated: (DERBY-3791) Excessive memory usage when fetching small Clobs

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dag H. Wanvik updated DERBY-3791:
---------------------------------

    Bug behavior facts: [Performance, Regression]  (was: [Regression])

> Excessive memory usage when fetching small Clobs
> ------------------------------------------------
>
>                 Key: DERBY-3791
>                 URL: https://issues.apache.org/jira/browse/DERBY-3791
>             Project: Derby
>          Issue Type: Bug
>          Components: JDBC
>    Affects Versions: 10.2.2.0, 10.3.1.4, 10.4.1.3, 10.5.1.1
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Minor
>             Fix For: 10.4.2.0, 10.5.1.1
>
>         Attachments: derby-3791-1a-buffer_fix.diff, derby-3791-2a-buffer-adjustments.diff
>
>
> When investigating DERBY-3312 I found out that performance with the embedded driver has decreased a lot as well.
> Analysis on trunk indicates excessive memory usage, causing high allocation and garbage collection costs.
> I believe there was another major performance problem in 10.3, but it appears fixed in trunk. I have not spent time identifying this problem.



[jira] Commented: (DERBY-3791) Excessive memory usage when fetching small Clobs

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615582#action_12615582 ] 

Kristian Waagan commented on DERBY-3791:
----------------------------------------

I ran some tests using SingleRecordSelectClient (found in the svn repository).
I used default settings and ran five times with one client and five times with 16 clients. The results are "normalized", where the throughput obtained with 10.2.2.1 is defined as 100%.

1 client  16 clients  Version
    100%        100%  10.2.2.1
     75%        216%  10.3.3.0
     79%        231%  10.4.1.3
     78%        231%  trunk
    146%        394%  trunk-buffer-fix
    167%        456%  trunk-StringClob-fix

As can be seen, we do have a regression when using one client. The results obtained with SingleRecordSelectClient are a bit different from those I saw with the LobPerf repro posted under DERBY-3312. This might be because the repro doesn't commit after each select, whereas SingleRecordSelectClient does. Not committing can cause the list of open Clobs to grow large, which might reduce performance further.

I've played with two fixes. The first is really simple: one buffer is removed and another is sized according to the Clob size. Without the fix, we allocated at least 32 KB of character buffers for each Clob. Needless to say, this is quite a lot when the Clob itself has between one and five characters.

The second fix adds another InternalClob implementation, which I have called StringClob. It is used for Clobs that are too small to be represented as streams. Currently these Clobs are represented as a byte array in memory. It turns out that we start with a byte array (from store), convert it to a string, convert it back to a byte array, and finally convert whatever data the user requests to strings again.
Adding yet another internal Clob representation is not ideal with regard to testing and code volume, but because it is so simple I am considering it for the extra performance.
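
To make the chain of conversions concrete, here is a minimal Java sketch. It is not Derby code: the class and method names are hypothetical, and standard UTF-8 stands in for whatever encoding the store actually uses. It only contrasts the three-step round trip described above with decoding the bytes once and keeping the String.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration; not the InternalClob or StringClob implementation. */
final class ClobConversionSketch {

    /** Mimics the described path: bytes -> String -> bytes -> String. */
    static String viaByteArrayClob(byte[] fromStore, int pos, int len) {
        String decoded = new String(fromStore, StandardCharsets.UTF_8);   // conversion 1
        byte[] reEncoded = decoded.getBytes(StandardCharsets.UTF_8);      // conversion 2
        String again = new String(reEncoded, StandardCharsets.UTF_8);     // conversion 3
        return again.substring(pos, pos + len);
    }

    /** Mimics the StringClob idea: decode once, then serve requests from the String. */
    static String viaStringClob(byte[] fromStore, int pos, int len) {
        String value = new String(fromStore, StandardCharsets.UTF_8);     // single conversion
        return value.substring(pos, pos + len);
    }

    public static void main(String[] args) {
        byte[] stored = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(viaByteArrayClob(stored, 1, 3));
        System.out.println(viaStringClob(stored, 1, 3));
    }
}
```

Both methods return the same substring; the second simply skips two intermediate copies of the data, which is where the extra allocation would be saved for very small Clobs.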

Before I continue working on the StringClob fix, I want to clean up the InternalClob interface slightly.



[jira] Commented: (DERBY-3791) Excessive memory usage when fetching small Clobs

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616856#action_12616856 ] 

Knut Anders Hatlen commented on DERBY-3791:
-------------------------------------------

The patch looks fine to me.

Just to check that I've understood correctly: removing the BufferedInputStream won't affect performance negatively because the underlying stream gets its data from a byte array, so buffering the data in another byte array on top of that won't improve the speed at which data can be read. Is that about right?
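
As a small illustration of the question, the following Java sketch reads the same in-memory bytes with and without a BufferedInputStream on top. The class name and the drain helper are hypothetical, and a plain ByteArrayInputStream stands in for the stream that the patch touches. The point is only that the wrapper copies data from one byte array into another without changing what is read.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration; not taken from the patch. */
final class InMemoryBufferingSketch {

    /** Reads the stream to the end in small chunks and returns everything that was read. */
    static byte[] drain(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[16];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] clobBytes = "abc".getBytes(StandardCharsets.UTF_8);
        // Source is already a byte array in memory.
        byte[] direct = drain(new ByteArrayInputStream(clobBytes));
        // Same source, with an extra buffer layered on top.
        byte[] wrapped = drain(new BufferedInputStream(new ByteArrayInputStream(clobBytes)));
        System.out.println("same bytes either way: " + Arrays.equals(direct, wrapped));
    }
}
```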



[jira] Updated: (DERBY-3791) Excessive memory usage when fetching small Clobs

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3791:
-----------------------------------

    Derby Info: [Regression]  (was: [Regression, Patch Available])

That is about right.
Of course there is an exception, and I'm not sure how to handle it yet.
When a Clob is created with Connection.createClob and grows big enough, or a Clob is fetched from the database and then modified, data is copied to a temporary file on disk.
If you then read from this Clob again, you get a stack that looks something like this (seen from Clob.getCharacterStream):
  ClobUpdatableReader
    UTF8Reader
      LOBInputStream
        LOBStreamControl
          LOBFile
            (Storage)RandomAccessFile

As far as I can see, this stack will be unbuffered after patch 1a is applied (in our Java code, that is; I don't know how much the OS can help here).
The code is getting pretty complex down in LOBStreamControl and LOBFile, so I think patch 1a has to be reworked to handle the buffering in a sensible way higher up in the stack, probably in UTF8Reader.
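
A rough Java sketch of why a buffer higher up in the stack matters when the bottom of the stack is expensive to call. Everything here is hypothetical: CountingInputStream is an illustrative wrapper, and a ByteArrayInputStream stands in for the file-backed LOBInputStream. The sketch only counts how many read calls reach the lowest stream with and without a buffer in between.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical illustration; not Derby's reader stack. */
final class BufferingHigherUp {

    /** Counts every read call that reaches the wrapped (lowest-level) stream. */
    static final class CountingInputStream extends FilterInputStream {
        int calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads one byte at a time until end of stream; returns the number of bytes seen. */
    static int readByteByByte(InputStream in) {
        try {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns {calls reaching the source without a buffer, calls with a buffer}. */
    static int[] compare(byte[] data, int bufferSize) {
        CountingInputStream direct = new CountingInputStream(new ByteArrayInputStream(data));
        readByteByByte(direct);

        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
        readByteByByte(new BufferedInputStream(source, bufferSize));

        return new int[] { direct.calls, source.calls };
    }

    public static void main(String[] args) {
        int[] calls = compare(new byte[64 * 1024], 8 * 1024);
        System.out.println("unbuffered calls: " + calls[0] + ", buffered calls: " + calls[1]);
    }
}
```

When each call at the bottom ends up in a RandomAccessFile, the number of calls is a reasonable proxy for cost, which is the case the in-memory path does not have to worry about.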

I didn't see any performance degradation in suites.All with the patch, but that probably just means we don't test this code path very well. I'll see if I can write a series of tests and maybe put them under java/testing/org/apache/derbyTesting/perf/basic/jdbc/.

I'm clearing the patch available flag; patch 1a will not be committed.



[jira] Updated: (DERBY-3791) Excessive memory usage when fetching small Clobs

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3791:
-----------------------------------

    Attachment: derby-3791-1a-buffer_fix.diff

'derby-3791-1a-buffer_fix.diff' makes UTF8Reader allocate a smaller buffer if possible.
The following alternatives are considered (if possible):
 - length of the byte stream (if available)
 - the maximum allowed length for the field (if set)
 - the maximum size of the buffer

In addition, the buffering of the internal stream is removed because it is already buffered at lower levels.

Note that the buffer can still be too large because we get information about the size in number of bytes. Since we don't know how many chars we get from the number of bytes, I decided to assume one char per byte. The patch will improve the situation for small streams anyway. For larger streams, the maximum buffer size will be used (8 KB).
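
The selection among the three alternatives can be sketched as follows. This is a minimal illustration, not the code in the patch: the class and method names are hypothetical, and using -1 to mean "not available" or "not set" is an assumption made for the sketch. As described above, one char per byte is assumed, and 8 KB is the cap.

```java
/** Hypothetical helper illustrating the buffer size choice; not UTF8Reader itself. */
final class ReaderBufferSizing {

    /** Maximum buffer size (8 KB), as stated in the comment above. */
    static final int MAX_BUFFER_SIZE = 8 * 1024;

    /**
     * Picks the smallest applicable candidate.
     *
     * @param streamByteLength length of the byte stream, or -1 if not available (assumed convention)
     * @param maxFieldLength   maximum allowed length for the field, or -1 if not set (assumed convention)
     * @return number of chars to allocate, assuming one char per byte
     */
    static int chooseBufferSize(long streamByteLength, long maxFieldLength) {
        long size = MAX_BUFFER_SIZE;
        if (streamByteLength >= 0) {
            size = Math.min(size, streamByteLength);
        }
        if (maxFieldLength >= 0) {
            size = Math.min(size, maxFieldLength);
        }
        // Avoid a zero-length buffer for empty values.
        return (int) Math.max(1, size);
    }

    public static void main(String[] args) {
        System.out.println(chooseBufferSize(3, -1));        // tiny Clob
        System.out.println(chooseBufferSize(100_000, -1));  // large Clob, capped
    }
}
```

Because the length is known in bytes and a character may occupy more than one byte, the chosen size can overshoot the real character count, but it never exceeds the cap.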

suites.All passed.
Patch ready for review.



[jira] Resolved: (DERBY-3791) Excessive memory usage when fetching small Clobs

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan resolved DERBY-3791.
------------------------------------

       Resolution: Fixed
    Fix Version/s: 10.5.0.0
                   10.4.2.0

Backported to 10.4 with revision 682813.
I tried merging to 10.3, but got a single conflict (didn't have time to investigate, might be a simple one to fix).



[jira] Updated: (DERBY-3791) Excessive memory usage when fetching small Clobs

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3791:
-----------------------------------

    Attachment: derby-3791-2a-buffer-adjustments.diff

'derby-3791-2a-buffer-adjustments.diff' adjusts the buffer sizes of the internal char array and the BufferedInputStream.
If the Clob is known to be smaller than 8 KB (maximum allowed buffer size), the buffer is set to match the Clob size.
Note that the internal buffer is for characters, whereas the stream buffers bytes. The same size is used for both.
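
A small Java sketch of applying one size to both buffers, under stated assumptions: the class name and fields are hypothetical, and the rule "use the Clob length when it is positive and below 8 KB, otherwise 8 KB" is my reading of the description above rather than the patch itself.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

/** Hypothetical holder for the two buffers; not the class changed by the patch. */
final class SizedClobReader {

    /** Maximum allowed buffer size (8 KB). */
    static final int MAX_BUFFER_SIZE = 8 * 1024;

    /** Internal buffer, counted in characters. */
    final char[] charBuffer;

    /** Byte stream whose own buffer is counted in bytes. */
    final InputStream in;

    SizedClobReader(InputStream raw, long clobLength) {
        int size = (clobLength > 0 && clobLength < MAX_BUFFER_SIZE)
                ? (int) clobLength
                : MAX_BUFFER_SIZE;
        // The same number is used for both, even though the units differ.
        this.charBuffer = new char[size];
        this.in = new BufferedInputStream(raw, size);
    }

    public static void main(String[] args) {
        SizedClobReader small = new SizedClobReader(new ByteArrayInputStream(new byte[5]), 5);
        System.out.println("char buffer length: " + small.charBuffer.length);
    }
}
```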

Committed to trunk with revision 681723.
