Posted to derby-dev@db.apache.org by Kristian Waagan <Kr...@Sun.COM> on 2008/11/03 19:49:32 UTC
Suggestion for improving ClobUpdatableReader and related code
Hello,
While investigating the LOB code, it occurred to me that
ClobUpdatableReader and some related code could be changed for
two main reasons: performance and code readability/simplification. I'll
focus on the latter in this mail.
At the moment, the updatable reader functionality is tightly coupled to
TemporaryClob and has special handling for the various internal Clob
representations. I believe the functionality can be provided efficiently
on a more general level, and the responsibilities can also be more
clearly separated.
Below I try to outline a solution to this problem. I would like some
high-level feedback on whether the suggestion is sound or not.
(Please ask if anything is unclear; the description omits some
information in an attempt to keep it short.)
* Responsibilities
- positioning: handled by/through UTF8Reader.
- detecting modifications and handling them: ClobUpdatableReader.
* New classes/interfaces
- PositionedStream (generalization of PositionedStoreStream): extends
InputStream; getPosition(), reposition(long). The idea here is to
exploit the fact that TemporaryClob is directly addressable (by byte
position, *not* by char position).
- CharacterStreamDescriptor: a class containing information about a
byte stream representing characters. Will be passed in to UTF8Reader, so
that it can configure itself appropriately (current byte/char position,
byte/char length, is bufferable/positionAware, max char length,
dataOffset).
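To make the two new types a bit more concrete, here is a rough Java sketch. Only the names PositionedStream, CharacterStreamDescriptor, getPosition() and reposition(long) come from the description above; the byte-array implementation, the field names, the constructor and the demo helper are made up for illustration and are not the prototype's code.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Sketch of the proposed abstraction: an InputStream addressable by BYTE position.
abstract class PositionedStream extends InputStream {
    /** Current byte position in the underlying data (0-based here, an assumption). */
    abstract long getPosition();
    /** Moves the stream to the requested byte position. */
    abstract void reposition(long requestedPos) throws IOException;
}

// Hypothetical toy implementation over a byte array, standing in for a
// directly addressable representation.
final class ByteArrayPositionedStream extends PositionedStream {
    private final byte[] data;
    private int pos;

    ByteArrayPositionedStream(byte[] data) { this.data = data.clone(); }

    @Override public int read() { return pos < data.length ? (data[pos++] & 0xFF) : -1; }
    @Override long getPosition() { return pos; }
    @Override void reposition(long requestedPos) throws IOException {
        if (requestedPos < 0 || requestedPos > data.length) {
            throw new EOFException("Position out of range: " + requestedPos);
        }
        pos = (int) requestedPos;
    }
}

// Immutable value object describing a byte stream that represents characters.
// The field set mirrors the list in the mail; names and types are assumptions.
final class CharacterStreamDescriptor {
    final long bytePos, charPos, byteLength, charLength, maxCharLength, dataOffset;
    final boolean bufferable, positionAware;

    CharacterStreamDescriptor(long bytePos, long charPos, long byteLength,
                              long charLength, long maxCharLength, long dataOffset,
                              boolean bufferable, boolean positionAware) {
        this.bytePos = bytePos;
        this.charPos = charPos;
        this.byteLength = byteLength;
        this.charLength = charLength;
        this.maxCharLength = maxCharLength;
        this.dataOffset = dataOffset;
        this.bufferable = bufferable;
        this.positionAware = positionAware;
    }
}

// Hypothetical helper: reposition by byte offset and read one byte.
final class PositionedStreamDemo {
    static int byteAt(byte[] data, long bytePos) {
        try (PositionedStream in = new ByteArrayPositionedStream(data)) {
            in.reposition(bytePos);
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that for the UTF-8 bytes of "a\u00e9b" the character 'b' sits at byte offset 3 but at char index 2, which is why the byte/char distinction matters and why the descriptor carries both positions.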
* Changes
- InternalClob: add isReleased() and getUpdateCount() to support the
updatable reader functionality. The former is used to check if the
internal representation has changed, the latter to detect content
modifications.
- ClobUpdatableReader: will be simplified, practically rewritten.
- UTF8Reader: new constructors and other minor changes. One notable
change is that it will no longer be this class' responsibility to read
the encoded length information in the streams from store. I'm hoping
this can be done in a utility class to avoid duplicating that code.
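As an illustration of how isReleased() and getUpdateCount() could drive the simplified updatable reader, here is a minimal Java sketch. It is not the prototype: TrackedClob, InMemoryClob, UpdateAwareReader and the Supplier used to look up the current representation are hypothetical stand-ins, and character positions are 0-based for simplicity.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical stand-in for the part of InternalClob that matters here.
interface TrackedClob {
    boolean isReleased();        // true once this representation has been replaced
    long getUpdateCount();       // incremented on every content modification
    Reader getReader(long charPos) throws IOException; // 0-based in this sketch
}

// Toy in-memory representation, for demonstration only.
final class InMemoryClob implements TrackedClob {
    private final StringBuilder content;
    private long updateCount;
    private boolean released;

    InMemoryClob(String s) { content = new StringBuilder(s); }

    void setString(int index, String s) {
        content.replace(index, index + s.length(), s);
        updateCount++;
    }
    void release() { released = true; }

    @Override public boolean isReleased() { return released; }
    @Override public long getUpdateCount() { return updateCount; }
    @Override public Reader getReader(long charPos) {
        int start = (int) Math.min(charPos, content.length());
        return new StringReader(content.substring(start));
    }
}

// Reader that re-synchronizes with the Clob before each read if the
// representation was swapped or the content was modified.
final class UpdateAwareReader extends Reader {
    private final Supplier<TrackedClob> current; // yields the live representation
    private TrackedClob clob;
    private Reader in;
    private long charPos;          // number of chars delivered so far
    private long seenUpdateCount;

    UpdateAwareReader(Supplier<TrackedClob> current) throws IOException {
        this.current = current;
        this.clob = current.get();
        this.seenUpdateCount = clob.getUpdateCount();
        this.in = clob.getReader(0);
    }

    private void resyncIfNeeded() throws IOException {
        boolean swapped = clob.isReleased();
        if (swapped) {
            clob = current.get();
        }
        if (swapped || clob.getUpdateCount() != seenUpdateCount) {
            seenUpdateCount = clob.getUpdateCount();
            in = clob.getReader(charPos); // reopen at the current char position
        }
    }

    @Override public int read(char[] buf, int off, int len) throws IOException {
        resyncIfNeeded();
        int n = in.read(buf, off, len);
        if (n > 0) {
            charPos += n;
        }
        return n;
    }

    @Override public void close() throws IOException { in.close(); }
}

// Hypothetical demo: read three chars, modify the rest, keep reading.
final class UpdateAwareReaderDemo {
    static String readAcrossUpdate() {
        try {
            InMemoryClob clob = new InMemoryClob("abcdef");
            UpdateAwareReader r = new UpdateAwareReader(() -> clob);
            char[] buf = new char[3];
            StringBuilder out = new StringBuilder();
            int n = r.read(buf, 0, 3);
            out.append(buf, 0, n);
            clob.setString(3, "XYZ");     // modify the part not yet read
            n = r.read(buf, 0, 3);
            out.append(buf, 0, n);
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the reader itself knows nothing about how the Clob is stored; it only compares the update count and checks the released flag, which is the separation of responsibilities described above.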
If I don't get any pushback on this suggestion, I will create a parent
Jira issue (probably describing the performance problem) and a set of
subtasks under it. The diff for my current prototype patch is at around
1200 lines.
--
Kristian