Posted to derby-dev@db.apache.org by "Mamta A. Satoor (JIRA)" <ji...@apache.org> on 2007/04/04 20:22:32 UTC

[jira] Updated: (DERBY-2524) DataTypeDescriptor(DTD) needs to have collation type and collation derivation. These new fields will apply only for character string types. Other types should ignore them.

     [ https://issues.apache.org/jira/browse/DERBY-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mamta A. Satoor updated DERBY-2524:
-----------------------------------

    Attachment: DERBY2524_Collation_Info_In_DTD_v1_stat.txt
                DERBY2524_Collation_Info_In_DTD_v1_diff.txt

I just finished committing a patch (DERBY2524_Collation_Info_In_DTD_v1_diff.txt), which is attached to this Jira entry. The patch was committed as part of revision 525568 and does the following 2 things:
1)Adds collation type and collation derivation attributes and apis to the TypeDescriptor interface and its implementations.
2)Saves the collation type in the scale field of character types in the writeExternal method of TypeDescriptorImpl, and reads the scale field back into the collation type for character types in the readExternal method of TypeDescriptorImpl.

svn stat -q
M      java\engine\org\apache\derby\iapi\types\DataTypeDescriptor.java
M      java\engine\org\apache\derby\catalog\TypeDescriptor.java
M      java\engine\org\apache\derby\catalog\types\TypeDescriptorImpl.java

Details of the patch
1)Added getters and setters for collationType and collationDerivation in TypeDescriptor. In addition, TypeDescriptor has new constants defined in it which will be used by the rest of the collation related code in Derby. One of the constants is COLLATION_DERIVATION_INCORRECT. I am initializing the collation derivation for all the data types to COLLATION_DERIVATION_INCORRECT in TypeDescriptorImpl. For character string types, this should get changed to "implicit" or "none" before the runtime code kicks in. For all the other types, it will remain set to COLLATION_DERIVATION_INCORRECT because collation does not apply to those data types.
2)DTD implements the new apis in the TypeDescriptor interface.
3)Two sets of changes went into TypeDescriptorImpl:
a)TypeDescriptorImpl has 2 new fields, namely, collationType and collationDerivation. collationDerivation is initialized to TypeDescriptor.COLLATION_DERIVATION_INCORRECT. For character string types, these fields should get set correctly. In addition, there are apis to set and get the values of these 2 fields.
b)The next change for this class is in the writeExternal and readExternal methods. I would like the community's feedback on my assumption for this particular change. The collation type of a character string type will get saved in the existing scale field, since scale does not apply to character string types. My question is about collation derivation. The collation derivation information does not get saved like the collation type. But maybe that is ok, because I am assuming that writeExternal and readExternal get called only for persistent columns (ie columns belonging to system and user tables). The collation derivation of such character string columns (coming from persistent tables) is always implicit. Hence, in readExternal, for character string types, I can initialize collation derivation to implicit. My assumption is that the readExternal and writeExternal methods will never get called for character string types with a collation derivation of none or explicit. Today we don't have explicit as one of the possible values for collation derivation, but a character string type will have a collation derivation of none if it was the result of an aggregate method involving operands with different collation derivations. This comes from item 11) in Section Collation Determination at http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478
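
To make the scale-field idea in 3b) concrete, here is a minimal sketch of the round trip. It is not the code from the patch: SketchTypeDescriptor, the derivation strings and the use of java.sql.Types to detect character types are all assumptions made for illustration, and the real TypeDescriptorImpl writes more fields than this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.sql.Types;

// Hypothetical stand-in for TypeDescriptorImpl; not the Derby class.
class SketchTypeDescriptor implements Externalizable {
    // Assumed string values; the real constants are defined in TypeDescriptor.
    static final String DERIVATION_INCORRECT = "INCORRECT";
    static final String DERIVATION_IMPLICIT = "IMPLICIT";

    private int jdbcTypeId;
    private int scale;
    private int collationType;
    private String collationDerivation = DERIVATION_INCORRECT;

    // Externalizable classes need a public no-arg constructor.
    public SketchTypeDescriptor() {}

    SketchTypeDescriptor(int jdbcTypeId, int scale, int collationType) {
        this.jdbcTypeId = jdbcTypeId;
        this.scale = scale;
        this.collationType = collationType;
    }

    int getScale() { return scale; }
    int getCollationType() { return collationType; }
    String getCollationDerivation() { return collationDerivation; }

    private boolean isCharacterType() {
        switch (jdbcTypeId) {
            case Types.CHAR:
            case Types.VARCHAR:
            case Types.LONGVARCHAR:
            case Types.CLOB:
                return true;
            default:
                return false;
        }
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(jdbcTypeId);
        // Scale is meaningless for character types, so its slot carries
        // the collation type instead. The derivation is not written.
        out.writeInt(isCharacterType() ? collationType : scale);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        jdbcTypeId = in.readInt();
        int slot = in.readInt();
        if (isCharacterType()) {
            collationType = slot;
            scale = 0;
            // Assumption from the mail: persistent columns are always implicit.
            collationDerivation = DERIVATION_IMPLICIT;
        } else {
            scale = slot;
            collationType = 0;
            collationDerivation = DERIVATION_INCORRECT;
        }
    }

    // Writes the descriptor to a byte array and reads it back into a new instance.
    static SketchTypeDescriptor roundTrip(SketchTypeDescriptor original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                original.writeExternal(out);
            }
            SketchTypeDescriptor copy = new SketchTypeDescriptor();
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readExternal(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch also shows the limitation raised in the question: because only one slot is reused, a descriptor whose derivation is none would come back as implicit after a round trip.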

Questions
1)I have included all the constant definitions related to collation in TypeDescriptor. If anyone has a suggestion for a better place to define them, let me know. I wonder whether there is already a class for miscellaneous constant definitions like the ones I have added. TypeDescriptor does look like a good place for these constants, because they all belong to the data type world.
2)Is it right to assume that readExternal and writeExternal methods in TypeDescriptorImpl will get called only for persistent columns?

Although the patch is committed, please feel free to provide feedback on it. I will especially appreciate any feedback on my questions above.

> DataTypeDescriptor(DTD) needs to have collation type and collation derivation. These new fields will apply only for character string types. Other types should ignore them.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2524
>                 URL: https://issues.apache.org/jira/browse/DERBY-2524
>             Project: Derby
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 10.3.0.0
>            Reporter: Mamta A. Satoor
>         Assigned To: Mamta A. Satoor
>         Attachments: DERBY2524_Collation_Info_In_DTD_v1_diff.txt, DERBY2524_Collation_Info_In_DTD_v1_stat.txt
>
>
> This is one of the pieces of groundwork for getting different kinds of collations working for character string types. More information on this project can be found at http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478. Basically, all the types in Derby have a DTD associated with them. For character string types, these DTDs should have valid values for collation derivation and collation type. For other data types, these 2 fields do not apply and should be ignored.
> The SQL spec talks about character string types having a collation type and a collation derivation associated with them (SQL spec Section 4.2.2 Comparison of character strings). If the collation derivation is explicit or implicit, then there is a valid collation type associated with the character string type. If the collation derivation is none, then the collation type can't be established for the character string type. 
> 1)Collation derivation will be explicit if the COLLATE clause has been used for the character string type (this is not a possibility for Derby 10.3, because we are not planning to support the SQL COLLATE clause in this release). 
> 2)Collation derivation will be implicit if the collation can be determined w/o the COLLATE clause. For example, with CREATE TABLE t1(c11 char(4)), c11 will have the collation of the USER character set. As another example, with TRIM(c11), the result character string of the TRIM operation will have the collation of the operand, c11. 
> 3)Collation derivation will be none if the aggregate methods are dealing with character strings with different collations (Section 9.3 Data types of results of aggregations Syntax Rule 3aii). 
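
The implicit and none cases in items 2) and 3) above can be sketched as follows. This is only an illustration under stated assumptions: CollationSketch, its method names and the collation names used in the test values are hypothetical, explicit is left out because Derby 10.3 does not plan to support the COLLATE clause, and the full SQL rules cover more cases than the two shown here.

```java
// Hypothetical illustration of the derivation rules; not Derby code.
final class CollationSketch {
    // Explicit is omitted because the COLLATE clause is out of scope for 10.3.
    enum Derivation { NONE, IMPLICIT }

    static final class Info {
        final Derivation derivation;
        final String collationName; // null when the derivation is NONE

        Info(Derivation derivation, String collationName) {
            this.derivation = derivation;
            this.collationName = collationName;
        }
    }

    private CollationSketch() {}

    // A column such as c11 char(4) gets its collation without a COLLATE clause.
    static Info ofColumn(String collationName) {
        return new Info(Derivation.IMPLICIT, collationName);
    }

    // TRIM(c11): the result takes the collation of its operand.
    static Info ofTrim(Info operand) {
        return operand;
    }

    // Aggregation over two operands: same collation stays implicit,
    // different collations (or an operand that is already NONE) give NONE.
    static Info ofAggregation(Info left, Info right) {
        boolean bothKnown = left.derivation == Derivation.IMPLICIT
                && right.derivation == Derivation.IMPLICIT;
        if (bothKnown && left.collationName.equals(right.collationName)) {
            return new Info(Derivation.IMPLICIT, left.collationName);
        }
        return new Info(Derivation.NONE, null);
    }
}
```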



Re: [jira] Updated: (DERBY-2524) DataTypeDescriptor(DTD) needs to have collation type and collation derivation. These new fields will apply only for character string types. Other types should ignore them.

Posted by Daniel John Debrunner <dj...@apache.org>.
Daniel John Debrunner wrote:

> Why string constants for derivation though, why not integer values?
> Also there's no constant for error, though I would not recommend having 
> an error value. Why not just use none as the default value? That will 
> lead to the same behaviour, no collation supported. Adding an error 
> state is something that is not in the SQL standard.

Sorry, I misread error as explicit since I was expecting an explicit type 
to be defined. Still, I'm not sure why an "error" state is needed.

Dan.



Re: [jira] Updated: (DERBY-2524) DataTypeDescriptor(DTD) needs to have collation type and collation derivation. These new fields will apply only for character string types. Other types should ignore them.

Posted by Mamta Satoor <ms...@gmail.com>.
Copied from Dan's mail
*************************************************************
> 2)Is it right to assume that readExternal and writeExternal methods in
> TypeDescriptorImpl will get called only for persistent columns?


Probably, though looking forward, at some point the derivation will need
to be stored, so it might be worth thinking about this now to avoid future
upgrades.

*************************************************************
I wonder if the precision field is available for character string types. If it
is available, then we can possibly use it to store the collation derivation (I
will convert collation derivation to int in my follow-up patch) for character
string types.

Mamta




Re: [jira] Updated: (DERBY-2524) DataTypeDescriptor(DTD) needs to have collation type and collation derivation. These new fields will apply only for character string types. Other types should ignore them.

Posted by Daniel John Debrunner <dj...@apache.org>.
Mamta A. Satoor (JIRA) wrote:


> Questions
> 1)I have included all the constant definitions related to collation in TypeDescriptor.
> If anyone has suggestion on a better place to define them, let me know.

I'd thought about defining such constants on StringDataValue since they 
only apply to character types.

Why string constants for derivation though, why not integer values?
Also there's no constant for error, though I would not recommend having 
an error value. Why not just use none as the default value? That will 
lead to the same behaviour, no collation supported. Adding an error 
state is something that is not in the SQL standard.

Note that the javadoc comments you added for these constants are only 
for the first constant, the others will have no javadoc.
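
As a rough sketch of what these suggestions could look like together (integer values, none as the default, and a javadoc comment on every constant), with all names and values here being hypothetical rather than taken from the patch:

```java
/** Hypothetical holder for collation derivation constants; not Derby code. */
interface CollationConstantsSketch {
    /** The collation cannot be established, or the type is not a character type. */
    int COLLATION_DERIVATION_NONE = 0;

    /** The collation was determined without a COLLATE clause. */
    int COLLATION_DERIVATION_IMPLICIT = 1;

    /** The collation was given by a COLLATE clause. */
    int COLLATION_DERIVATION_EXPLICIT = 2;
}

/** Hypothetical descriptor that defaults to none instead of an error state. */
class DescriptorDefaultSketch {
    private int collationDerivation =
            CollationConstantsSketch.COLLATION_DERIVATION_NONE;

    int getCollationDerivation() {
        return collationDerivation;
    }

    void setCollationDerivation(int derivation) {
        // Reject values outside the three defined constants.
        if (derivation < CollationConstantsSketch.COLLATION_DERIVATION_NONE
                || derivation > CollationConstantsSketch.COLLATION_DERIVATION_EXPLICIT) {
            throw new IllegalArgumentException(
                    "unknown collation derivation: " + derivation);
        }
        this.collationDerivation = derivation;
    }
}
```

Each constant carries its own javadoc block here, since a single comment placed above a group of fields is attached only to the first one.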

> 2)Is it right to assume that readExternal and writeExternal methods in
> TypeDescriptorImpl will get called only for persistent columns?

Probably, though looking forward, at some point the derivation will need 
to be stored, so it might be worth thinking about this now to avoid future 
upgrades.

Dan.