Posted to derby-dev@db.apache.org by "Kristian Waagan (JIRA)" <ji...@apache.org> on 2008/10/07 14:15:44 UTC

[jira] Updated: (DERBY-3769) Make LOBStoredProcedure on the server side smarter about the read buffer size

     [ https://issues.apache.org/jira/browse/DERBY-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3769:
-----------------------------------

    Attachment: derby-3769-2a-clob_buffer_size_adjustment.diff

Patch 2a adjusts the maximum return size in characters for the CLOB stored procedure to 10890 (DB2_VARCHAR_MAXWIDTH / 3). This means that anywhere from 10890 to 10890*3 bytes may be returned to the client in one round-trip, depending on the bytes-per-char ratio (determined by the modified UTF-8 encoding).
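
The figures above can be checked with a few lines of Java. This is only a sketch of the arithmetic, not code from the patch: the class and method names are hypothetical, the constant is copied from the value quoted in the issue description (Limits.DB2_VARCHAR_MAXWIDTH), and it assumes the usual modified UTF-8 bound of 1 to 3 bytes per Java char.

```java
// Hypothetical helper, for illustration only; not part of Derby.
final class ClobReturnSize {
    // Value quoted in the issue description for Limits.DB2_VARCHAR_MAXWIDTH.
    static final int DB2_VARCHAR_MAXWIDTH = 32672;
    // Assumption: modified UTF-8 encodes each Java char in 1 to 3 bytes.
    static final int MAX_BYTES_PER_CHAR = 3;

    // Largest char count whose worst-case encoding still fits in one reply.
    static int maxCharsPerCall() {
        return DB2_VARCHAR_MAXWIDTH / MAX_BYTES_PER_CHAR; // integer division
    }

    // Fewest bytes sent for a full reply (all chars encode to 1 byte, e.g. ASCII).
    static int minBytesPerCall() {
        return maxCharsPerCall();
    }

    // Most bytes sent for a full reply (all chars encode to 3 bytes, e.g. CJK).
    static int maxBytesPerCall() {
        return maxCharsPerCall() * MAX_BYTES_PER_CHAR;
    }

    public static void main(String[] args) {
        System.out.println("chars per call:     " + maxCharsPerCall());
        System.out.println("min bytes per call: " + minBytesPerCall());
        System.out.println("max bytes per call: " + maxBytesPerCall());
    }
}
```

Dividing by the worst-case ratio guarantees that a full reply never exceeds the limit, at the cost of sending only about a third of the possible payload per round-trip for single-byte data.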

Even though this fix isn't optimal, the advantages outweigh the disadvantages in my opinion.
I did a simple test, where I used a 32K buffer size in the client code to retrieve a CLOB of 32M chars consisting of CJK chars (3 bytes per char).
With the fix it took around 17 seconds; without it, it took almost 3400 seconds! In both cases a patch for DERBY-3825 was applied.
I also did a test with a 32MB CLOB containing ASCII characters, where I saw a performance reduction of around 3% (the test was run on a LAN; the reduction will grow on higher-latency networks).

If you want to test performance yourself, you must first apply the patch for DERBY-3825 (2a). The problems are described under DERBY-3766.

Patch ready for review.

> Make LOBStoredProcedure on the server side smarter about the read buffer size
> -----------------------------------------------------------------------------
>
>                 Key: DERBY-3769
>                 URL: https://issues.apache.org/jira/browse/DERBY-3769
>             Project: Derby
>          Issue Type: Improvement
>          Components: Network Server
>    Affects Versions: 10.3.3.0, 10.4.1.3, 10.5.0.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>             Fix For: 10.4.2.1, 10.5.0.0
>
>         Attachments: derby-3769-1a-buffer_size_adjustment.diff, derby-3769-1b-buffer_size_adjustment.diff, derby-3769-2a-clob_buffer_size_adjustment.diff
>
>
> Derby has a max length for VARBINARY and VARCHAR, which is 32'672 bytes or characters (see Limits.DB2_VARCHAR_MAXWIDTH).
> When working with LOBs represented by locators, using a read buffer larger than the max value causes the server to process far more data than necessary.
> Say the read buffer is 33'000 bytes, and these bytes are requested by the client. This request ends up in LOBStoredProcedure.BLOBGETBYTES.
> Assume the stream position is 64'000, and this is where we want to read from. The following happens:
>  a) BLOBGETBYTES instructs EmbedBlob to read 33'000 bytes, advancing the stream position to 97'000.
>  b) Derby fetches/receives the 33'000 bytes, but can only send 32'672. The rest of the data (328 bytes) is discarded.
>  c) The client receives the 32'672 bytes, recalculates the position and length arguments and sends another request.
>  d) BLOBGETBYTES(locator, 96672, 328) is executed. EmbedBlob detects that the stream position has advanced too far, so it resets the stream to position zero and skips/reads until position 96'672 has been reached.
>  e) The remaining 328 bytes are sent to the client.
> This issue deals with points b) and d), by avoiding the need to reset the stream.
> Points a) and e) are also problematic if a large number of bytes are going to be read, say hundreds of megabytes, but that's another issue.
> It is unfortunate that using 32 K (32 * 1024) as the buffer size is almost the worst case; 32'768 - 32'672 = 96 bytes.
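
The sequence a) to e) in the quoted description can be modelled with a small simulation. This is a rough sketch under stated assumptions, not Derby code: the class, its fields and the `simulate` method are hypothetical, the stream is modelled as a forward-only position that must be reset to zero and skipped forward when it has advanced too far, and the "capped" mode assumes the fix amounts to never reading more on the server than can be sent in one reply.

```java
// Hypothetical model, for illustration only; not part of Derby.
final class LobReadSimulation {
    // Value quoted in the issue description for Limits.DB2_VARCHAR_MAXWIDTH.
    static final int MAX_RETURN = 32672;

    static final class Result {
        final int requests;                 // client round-trips
        final int resets;                   // times the stream was reset to zero
        final long bytesSkippedAfterReset;  // bytes re-read just to reposition

        Result(int requests, int resets, long bytesSkippedAfterReset) {
            this.requests = requests;
            this.resets = resets;
            this.bytesSkippedAfterReset = bytesSkippedAfterReset;
        }
    }

    static Result simulate(long startPos, int clientBuffer, boolean capServerRead) {
        long streamPos = startPos; // server stream is already positioned here
        long pos = startPos;       // position the client asks for next
        int remaining = clientBuffer;
        int requests = 0;
        int resets = 0;
        long skipped = 0;

        while (remaining > 0) {
            requests++;
            if (streamPos > pos) {
                // Stream advanced too far: reset to zero and skip up to pos.
                resets++;
                skipped += pos;
                streamPos = pos;
            }
            // Without the cap the server reads everything that was requested.
            int toRead = capServerRead ? Math.min(remaining, MAX_RETURN) : remaining;
            streamPos += toRead;
            // Only MAX_RETURN bytes fit in one reply; the rest is discarded.
            int sent = Math.min(toRead, MAX_RETURN);
            pos += sent;
            remaining -= sent;
        }
        return new Result(requests, resets, skipped);
    }

    public static void main(String[] args) {
        Result naive = simulate(64000, 33000, false);
        Result capped = simulate(64000, 33000, true);
        System.out.println("uncapped: resets=" + naive.resets
                + ", skipped=" + naive.bytesSkippedAfterReset);
        System.out.println("capped:   resets=" + capped.resets
                + ", skipped=" + capped.bytesSkippedAfterReset);
    }
}
```

With the numbers from the description (position 64'000, buffer 33'000), the uncapped model needs one reset and re-reads 96'672 bytes to reposition, while the capped model needs none. Both take two requests, so in this model the saving comes entirely from avoiding the reset.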
