Posted to derby-dev@db.apache.org by "Knut Anders Hatlen (JIRA)" <ji...@apache.org> on 2013/02/12 15:51:13 UTC

[jira] [Commented] (DERBY-5752) LOBStreamControl should materialize less aggressively

    [ https://issues.apache.org/jira/browse/DERBY-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576651#comment-13576651 ] 

Knut Anders Hatlen commented on DERBY-5752:
-------------------------------------------

I had forgotten about this...

Now when I rerun the tests, I am not able to reproduce the big difference I saw in BlobClob4BlobTest in the first test run. I do still see a difference, but it's more like 165 seconds vs 180 seconds for the full BlobClob4BlobTest. As before, it looks like the entire difference is caused by testPositionAgressive() in an encrypted database, which slowed down from 7 seconds to 23 seconds in my environment. There is no difference in that test case on unencrypted databases.

The test case in question inserts a number of CLOBs, some of which are larger than the 32 KB limit for materialization, into a table. However, the query that reads the CLOBs is ordered on one of the non-CLOB columns, and the sorting materializes all the columns in the result. The test then scans through the fetched CLOBs using Clob.position().

The performance difference is seen because the java.sql.Clob objects fetched from the result set are no longer fully materialized in memory with the patch, unless they are smaller than 32k. For the big objects, this means that each call to Clob.position() will have to read temporary files and decrypt the contents in order to search for the substring. Without the patch, the entire value would live unencrypted in memory, which makes position() a much cheaper operation.
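
To illustrate the access pattern, here is a minimal sketch of a scan that calls Clob.position() repeatedly. It is not the code from BlobClob4BlobTest; the class and method names are made up for illustration, and javax.sql.rowset.serial.SerialClob is used only as an in-memory stand-in for a Clob fetched from Derby. With a non-materialized Derby Clob in an encrypted database, each position() call in the loop would presumably have to read and decrypt the temporary file, which is where the extra time goes.

```java
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;

// Hypothetical illustration, not Derby test code.
public class ClobPositionSketch {

    /** Counts non-overlapping occurrences of pattern by calling Clob.position() in a loop. */
    static int countOccurrences(Clob clob, String pattern) throws SQLException {
        int count = 0;
        long start = 1; // Clob positions are 1-based
        while (start <= clob.length()) {
            // Each call searches the CLOB contents from 'start' onwards.
            long found = clob.position(pattern, start);
            if (found < 0) {
                break; // no further match
            }
            count++;
            start = found + pattern.length();
        }
        return count;
    }

    public static void main(String[] args) throws SQLException {
        // SerialClob keeps everything in memory; it only stands in for a real result-set Clob.
        Clob clob = new SerialClob("abcXYZabcXYZabc".toCharArray());
        System.out.println(countOccurrences(clob, "abc"));
    }
}
```

The number of position() calls grows with the number of matches, so any per-call cost (file I/O, decryption) is multiplied accordingly.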

I think this is an expected difference, and that it is acceptable since the CLOB wasn't supposed to be materialized in this scenario in the first place. Of course, the current limit for materialization might not be optimal for all applications, as materialization indeed could improve performance of some operations if the system has enough memory. Increasing the limit or making it tunable might be a useful improvement, but it's outside the scope of this issue.
                
> LOBStreamControl should materialize less aggressively
> -----------------------------------------------------
>
>                 Key: DERBY-5752
>                 URL: https://issues.apache.org/jira/browse/DERBY-5752
>             Project: Derby
>          Issue Type: Improvement
>          Components: JDBC
>    Affects Versions: 10.9.1.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>         Attachments: buffsize.diff, d5752-1a.diff
>
>
> The constructor LOBStreamControl(EmbedConnection, byte[]) always makes the buffer size equal to the LOB size, effectively creating an extra, fully materialized copy of the LOB in memory.
> I think the assumption here is that a LOB that's already materialized is a small one. That is, LOBs that are smaller than 32 KB and fit in a single page are typically materialized when read from store. However, we sometimes materialize LOBs that are a lot bigger than 32 KB. For example, triggers that access LOBs may materialize them regardless of size (see comment in DMLWriteResultSet's constructor for details). For these large LOBs, it sounds unreasonable to allocate a buffer of the same size as the LOB itself.
> I'd suggest that we change the constructor so that it never allocates a buffer larger than 32KB. That would mean that the behaviour is preserved for all LOBs fetched directly from store (only LOBs that don't fit in a single page will cause temporary files to be created), whereas we'll prevent large LOBs accessed by triggers from being duplicated in memory by overflowing to temporary files.
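
The sizing rule proposed in the description above can be sketched roughly as follows. This is not the actual LOBStreamControl code or the attached patch; the class, constant, and method names are hypothetical, and the 32 KB cap is taken from the figure mentioned in the description.

```java
// Hypothetical illustration of the proposed rule, not Derby source code.
public class BufferCapSketch {

    /** Assumed cap, based on the 32 KB figure in the issue description. */
    static final int MAX_BUF_SIZE = 32 * 1024;

    /** Buffer size under the proposed rule: the LOB length, but never more than the cap. */
    static int proposedBufferSize(int lobLength) {
        return Math.min(lobLength, MAX_BUF_SIZE);
    }

    /** True if a LOB of this length would not fit in the buffer and would overflow to a temporary file. */
    static boolean overflowsToTempFile(int lobLength) {
        return lobLength > MAX_BUF_SIZE;
    }

    public static void main(String[] args) {
        int small = 10 * 1024;        // fits in the buffer, stays in memory
        int large = 5 * 1024 * 1024;  // e.g. a LOB materialized by a trigger
        System.out.println(proposedBufferSize(small) + " " + overflowsToTempFile(small));
        System.out.println(proposedBufferSize(large) + " " + overflowsToTempFile(large));
    }
}
```

Under this rule, only the large LOB would spill to a temporary file instead of being held as a second full copy in memory.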

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira