Posted to derby-dev@db.apache.org by "Kristian Waagan (JIRA)" <ji...@apache.org> on 2008/10/10 14:15:44 UTC

[jira] Commented: (DERBY-3907) Save useful length information for Clobs in store

    [ https://issues.apache.org/jira/browse/DERBY-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638538#action_12638538 ] 

Kristian Waagan commented on DERBY-3907:
----------------------------------------

A few starting points for discussion follow (all about the meta-information for Clobs).
I have assumed the following prerequisites:
 1) Clob modifications are done on a copy (i.e. TemporaryClob).
 2) The meta-information is of fixed length and at the start of the data stream (first page), so that it can be updated after the data has been streamed to store.

a) Format specification byte
    Shall we use a format specification ("magic number") byte?

b) Maximum Clob length (in characters)
    How many bits shall we use for the Clob length?
    Is representing today's maximum (2G-1) enough, or should we leave some headroom?

c) Storing byte length
    I mentioned storing the byte length as well, but haven't found any strong use cases.
    Opinions?
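
To make points a) to c) concrete, here is a minimal sketch of one possible fixed-length header, assuming a one-byte format marker followed by an eight-byte character length. The class name, the marker value 0xF1 and the field widths are all hypothetical and chosen only for illustration; this is not Derby's actual on-disk format. Because the header has a fixed size, a placeholder can be written first and overwritten once the data has been streamed and the length is known.

```java
import java.nio.ByteBuffer;

// Hypothetical fixed-length Clob header; NOT Derby's actual store format.
final class ClobHeaderSketch {
    // One format byte plus an 8-byte char length (assumed widths).
    static final int HEADER_LENGTH = 1 + 8;
    // Made-up "magic number" identifying this header layout.
    static final byte FORMAT_V1 = (byte) 0xF1;

    // Builds the header bytes for a Clob with the given length in characters.
    static byte[] encode(long charLength) {
        if (charLength < 0) {
            throw new IllegalArgumentException("negative length: " + charLength);
        }
        return ByteBuffer.allocate(HEADER_LENGTH)
                .put(FORMAT_V1)
                .putLong(charLength)   // big-endian by default
                .array();
    }

    // Reads the char length back, rejecting headers with an unknown format byte.
    static long decodeCharLength(byte[] header) {
        if (header.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("header too short");
        }
        ByteBuffer buf = ByteBuffer.wrap(header);
        if (buf.get() != FORMAT_V1) {
            throw new IllegalArgumentException("unknown format byte");
        }
        return buf.getLong();
    }
}
```

Eight bytes is far more headroom than 2G-1 needs (31 bits are enough for that); a four- or five-byte field would be an equally valid choice in this sketch.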

[Optimizations]

d) Bytes per character information
    Use a few bits to save bytes-per-character information, which can be used to optimize positioning.
    If the value is non-zero, one can calculate the byte position from the char position without decoding the stream.
    This information must be obtained by looking at all the bytes in the Clob, typically when inserting it.
    Example with 2 bits:
      0 = unknown/mixed
      1 = one byte per char
      2 = two bytes per char
      3 = three bytes per char
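
A minimal sketch of how the two-bit value could be derived and used, assuming the Clob data is encoded with the modified UTF-8 rules of java.io.DataOutput.writeUTF (one, two or three bytes per Java char). The class and method names are hypothetical, and the returned byte offset is relative to the start of the character data, so any header length would have to be added by the caller.

```java
// Hypothetical helper; assumes modified UTF-8 as used by DataOutput.writeUTF.
final class BytesPerCharSketch {
    static final int UNKNOWN_OR_MIXED = 0;

    // Encoded size of a single Java char in modified UTF-8.
    static int bytesFor(char c) {
        if (c >= 0x0001 && c <= 0x007F) return 1;
        if (c <= 0x07FF) return 2;   // includes 0x0000, which takes two bytes
        return 3;
    }

    // Looks at every char and returns 1, 2 or 3 if all chars have the same
    // encoded size, otherwise 0 (also 0 for empty input).
    static int classify(CharSequence data) {
        int seen = UNKNOWN_OR_MIXED;
        for (int i = 0; i < data.length(); i++) {
            int n = bytesFor(data.charAt(i));
            if (seen == UNKNOWN_OR_MIXED) {
                seen = n;
            } else if (seen != n) {
                return UNKNOWN_OR_MIXED;
            }
        }
        return seen;
    }

    // Byte offset of the char at zero-based charPos, or -1 if the stream
    // must be decoded because the size per char is unknown or mixed.
    static long byteOffset(long charPos, int bytesPerChar) {
        if (charPos < 0 || bytesPerChar < 0 || bytesPerChar > 3) {
            throw new IllegalArgumentException("invalid argument");
        }
        if (bytesPerChar == UNKNOWN_OR_MIXED) return -1;
        return charPos * bytesPerChar;
    }
}
```

In a real insert path the classification would be done on the fly while the data is streamed to store, rather than on a materialized CharSequence as in this sketch.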

e) Save "key positions" for the Clob
    For instance, save the char/byte positions for 25%, 50% and 75% of the Clob.
    This increases space overhead, but reduces the decoding/positioning costs for large Clobs.
    It also adds some complexity to the positioning logic in upper layer code (i.e. above store).
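
The extra positioning logic mentioned in e) could look roughly like the following sketch. It assumes the saved key positions are available as two parallel, ascending arrays (char positions and the matching byte positions); the class, method names and array representation are hypothetical. The idea is to skip directly to the byte position of the closest key position at or before the target and decode only the remaining characters.

```java
// Hypothetical positioning helper for saved "key positions".
final class KeyPositionsSketch {

    // Index of the last key position whose char position is <= target,
    // or -1 if decoding has to start from the beginning of the data.
    static int nearestKeyPosition(long[] keyCharPos, long targetCharPos) {
        int best = -1;
        for (int i = 0; i < keyCharPos.length; i++) {
            if (keyCharPos[i] <= targetCharPos) {
                best = i;
            } else {
                break;   // array is assumed to be sorted ascending
            }
        }
        return best;
    }

    // Byte position to skip to before decoding starts (0 if no key position helps).
    static long startByte(long[] keyCharPos, long[] keyBytePos, long targetCharPos) {
        int i = nearestKeyPosition(keyCharPos, targetCharPos);
        return i < 0 ? 0 : keyBytePos[i];
    }

    // Number of characters that still have to be decoded to reach the target.
    static long charsToDecode(long[] keyCharPos, long targetCharPos) {
        int i = nearestKeyPosition(keyCharPos, targetCharPos);
        return i < 0 ? targetCharPos : targetCharPos - keyCharPos[i];
    }
}
```

With key positions at 25%, 50% and 75%, the worst case in this sketch drops from decoding the whole Clob to decoding about a quarter of it.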


Please comment on these issues.
Information about the upgrade issue is also appreciated.

> Save useful length information for Clobs in store
> -------------------------------------------------
>
>                 Key: DERBY-3907
>                 URL: https://issues.apache.org/jira/browse/DERBY-3907
>             Project: Derby
>          Issue Type: Improvement
>          Components: JDBC, Store
>    Affects Versions: 10.5.0.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>
> The store should save useful length information for Clobs. This allows the length to be found without decoding the whole data stream.
> The following thread raised the issue of what information to store, and also contains some background information: http://www.nabble.com/Storing-length-information-for-CLOB-on-disk-tp19197535p19197535.html
> The information to store, and the exact format of it, is still to be discussed/determined.
> Currently two bytes are set aside for length information, which is inadequate.
