Posted to derby-dev@db.apache.org by "Kristian Waagan (JIRA)" <ji...@apache.org> on 2010/05/14 13:54:43 UTC
[jira] Created: (DERBY-4661) Reduce size of encoding buffer for
short character values
Reduce size of encoding buffer for short character values
---------------------------------------------------------
Key: DERBY-4661
URL: https://issues.apache.org/jira/browse/DERBY-4661
Project: Derby
Issue Type: Improvement
Components: JDBC
Affects Versions: 10.7.0.0
Environment: Inserts using setXStream(int, Reader/InputStream, int/long) for short values on character columns
Reporter: Kristian Waagan
Assignee: Kristian Waagan
Priority: Minor
When inserting character values, Derby converts from Java char to an on-disk encoding of UTF-8. To do this, the user stream is read and the bytes resulting from the conversion are placed in a "translation buffer". The default size of the buffer is 32 KB. When inserting a lot of short values, the pressure on the Java garbage collector is unnecessarily high, and the allocation/GC also causes somewhat higher CPU usage.
The effect of this issue can easily be reduced by sizing the buffer to fit the value in the appropriate cases.
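To make the cost concrete, here is a minimal Java sketch, not Derby's actual code. It assumes the on-disk format is the modified UTF-8 documented for java.io.DataOutput.writeUTF (1 to 3 bytes per Java char) and ignores the stream header; the class and method names are hypothetical. It encodes the same short value once into a 32 KB buffer and once into a buffer sized from the value length.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class TranslationBufferSketch {
    /** Default translation buffer size named in the issue: 32 KB. */
    static final int DEFAULT_BUFFER_SIZE = 32 * 1024;

    /**
     * Reads chars from the reader and writes them to buf as modified UTF-8
     * (assumed encoding). Returns the number of bytes written.
     */
    static int encode(Reader in, byte[] buf) {
        int off = 0;
        try {
            // Only read another char while a worst-case 3-byte encoding fits.
            while (off + 3 <= buf.length) {
                int c = in.read();
                if (c < 0) {
                    break; // end of the user stream
                }
                if (c >= 0x0001 && c <= 0x007F) {
                    buf[off++] = (byte) c;
                } else if (c <= 0x07FF) { // also covers c == 0
                    buf[off++] = (byte) (0xC0 | (c >> 6));
                    buf[off++] = (byte) (0x80 | (c & 0x3F));
                } else {
                    buf[off++] = (byte) (0xE0 | (c >> 12));
                    buf[off++] = (byte) (0x80 | ((c >> 6) & 0x3F));
                    buf[off++] = (byte) (0x80 | (c & 0x3F));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return off;
    }

    public static void main(String[] args) {
        String value = "hello";
        byte[] large = new byte[DEFAULT_BUFFER_SIZE];
        byte[] small = new byte[3 * value.length()]; // worst case for 5 chars
        System.out.println(encode(new StringReader(value), large)
                + " of " + large.length + " bytes used");
        System.out.println(encode(new StringReader(value), small)
                + " of " + small.length + " bytes used");
    }
}
```

Both calls fill the same five bytes; the rest of the 32 KB allocation exists only to be collected again, which is the overhead the issue describes.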
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (DERBY-4661) Reduce size of encoding buffer for
short character values
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-4661:
-----------------------------------
Issue & fix info: [Patch Available]
[jira] Resolved: (DERBY-4661) Reduce size of encoding buffer for
short character values
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan resolved DERBY-4661.
------------------------------------
Issue & fix info: (was: [Patch Available])
Fix Version/s: 10.7.0.0
Resolution: Fixed
[jira] Closed: (DERBY-4661) Reduce size of encoding buffer for
short character values
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan closed DERBY-4661.
----------------------------------
Closing issue.
[jira] Updated: (DERBY-4661) Reduce size of encoding buffer for
short character values
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-4661:
-----------------------------------
Attachment: derby-4661-1a-reduce_encoding_bz.diff, derby-4661-1a-reduce_encoding_bz.stat
Attaching patch 1a.
* iapi/types/ReaderToUTF8Stream
The actual fix. Note the special case when the value length is zero. To avoid the issue for shorter header lengths in the future, I used Math.max instead of handling valueLength == 0 specifically.
* other classes
Added StreamHeaderGenerator.getMaxHdrLength() and made SQLClob use it.
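The reasoning behind Math.max can be sketched as follows. This is an illustration under assumptions, not the code from the patch: the minimum buffer size, the 3-bytes-per-char factor, and all names below are hypothetical.

```java
final class MinimumBufferSketch {
    /** Hypothetical smallest buffer the fill loop can work with. */
    static final int MIN_BUFFER_SIZE = 6;
    /** Assumed worst-case number of encoded bytes per Java char. */
    static final int MAX_BYTES_PER_CHAR = 3;

    /** Handles only valueLength == 0; relies on the header being long enough otherwise. */
    static int sizeWithSpecialCase(int headerLength, int valueLength) {
        if (valueLength == 0) {
            return MIN_BUFFER_SIZE;
        }
        return headerLength + MAX_BYTES_PER_CHAR * valueLength;
    }

    /** Enforces the minimum for every combination of header and value length. */
    static int sizeWithMax(int headerLength, int valueLength) {
        return Math.max(MIN_BUFFER_SIZE,
                headerLength + MAX_BYTES_PER_CHAR * valueLength);
    }

    public static void main(String[] args) {
        // With a 5-byte header both variants stay at or above the minimum.
        System.out.println(sizeWithSpecialCase(5, 0) + " vs " + sizeWithMax(5, 0));
        // With a shorter 2-byte header, a one-char value slips below it
        // unless the minimum is enforced generally.
        System.out.println(sizeWithSpecialCase(2, 1) + " vs " + sizeWithMax(2, 1));
    }
}
```

The special-case variant is only correct as long as the header is long enough to keep every non-empty value above the minimum, which is the dependency Math.max removes.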
Some simple tests showed a performance improvement of around 30%. Real-world workloads will not see such a gain, but the fix may somewhat help heavily loaded servers where users insert small data values using the streaming classes.
Regression tests passed.
Patch ready for review.
[jira] Updated: (DERBY-4661) Reduce size of encoding buffer for
short character values
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-4661:
-----------------------------------
Fix Version/s: 10.5.3.1, 10.6.1.1
Backported to the 10.5 branch with revision 951363.
Backported to the 10.6 branch with revision 951362.
[jira] Commented: (DERBY-4661) Reduce size of encoding buffer for
short character values
Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867505#action_12867505 ]
Knut Anders Hatlen commented on DERBY-4661:
-------------------------------------------
This looks like a useful improvement. A couple of minor comments:
I think the code would be easier to follow if some of the magic numbers were replaced by constants or expressions. In particular:
- the number 10922 could be replaced by bz/3. That would make it clearer where the number came from, and it would reduce the likelihood of bugs being introduced if we change the default buffer size later.
- using a constant (e.g., READ_BUFFER_RESERVATION) instead of the number 6 in ReaderToUTF8Stream's constructor and in fillBuffer() would make the relation between those methods easier to see.
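A sketch of what these two suggestions might look like in Java. READ_BUFFER_RESERVATION is the name proposed above and 6 is the literal it would replace; the other names, and the direction of the comparison, are assumptions for illustration only and are not taken from the patch.

```java
final class BufferConstantsSketch {
    /** Default translation buffer size (32 KB per the issue description). */
    static final int DEFAULT_BUFFER_SIZE = 32 * 1024;
    /** Assumed worst-case number of encoded bytes per Java char. */
    static final int MAX_BYTES_PER_CHAR = 3;
    /** Named replacement for the literal 6 shared by the constructor and fillBuffer(). */
    static final int READ_BUFFER_RESERVATION = 6;

    /**
     * Hypothetical check for "short enough to use a reduced buffer".
     * The threshold is derived from the buffer size instead of being
     * hard-coded as 10922, so it follows any later change of the default.
     */
    static boolean isShortValue(int valueLength) {
        return valueLength < DEFAULT_BUFFER_SIZE / MAX_BYTES_PER_CHAR;
    }

    public static void main(String[] args) {
        // Integer division: 32768 / 3 == 10922
        System.out.println(DEFAULT_BUFFER_SIZE / MAX_BYTES_PER_CHAR);
        System.out.println(isShortValue(100));
    }
}
```

If the default size were later changed to, say, 64 KB, the derived threshold would move with it, whereas a hard-coded 10922 would silently go stale.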
Perhaps it would be better to call the new method in StreamHeaderGenerator getMaxHeaderLength()?
The call to LimitReader.setLimit() in the lengthless constructor was made redundant by the patch. Should it be removed?
[jira] Updated: (DERBY-4661) Reduce size of encoding buffer for
short character values
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-4661:
-----------------------------------
Attachment: derby-4661-1b-reduce_encoding_bz.diff
Attached patch 1b.
Thanks for reviewing the patch, Knut Anders.
I have addressed your suggestions, and I also made a few other changes (size logic, fixed two typos).
Regression tests passed (suites.All).
Patch ready for a second review.
[jira] Commented: (DERBY-4661) Reduce size of encoding buffer for
short character values
Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871060#action_12871060 ]
Knut Anders Hatlen commented on DERBY-4661:
-------------------------------------------
Thanks for making these changes, Kristian. The 1b patch looks good to me.
I assume this part of the patch was unintended, though:
Property changes on: java/engine/org/apache/derby/iapi/types
___________________________________________________________________
Added: svn:ignore
+ .ReaderToUTF8Stream.java.swp
[jira] Updated: (DERBY-4661) Reduce size of encoding buffer for
short character values
Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/DERBY-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kristian Waagan updated DERBY-4661:
-----------------------------------
Attachment: derby-4661-1b-reduce_encoding_bz.diff
Yes, that change wasn't intended (my IDE did that for me).
Uploaded a new rev of patch 1b, and committed it to trunk with revision 948069.
Thanks,