Posted to derby-dev@db.apache.org by "Rick Hillegas (JIRA)" <de...@db.apache.org> on 2005/08/24 01:13:08 UTC

[jira] Created: (DERBY-533) Re-enable national character datatypes

Re-enable national character datatypes
--------------------------------------

Key: DERBY-533
URL: http://issues.apache.org/jira/browse/DERBY-533
Project: Derby
Type: New Feature
Components: SQL
Versions: 10.1.1.0
Reporter: Rick Hillegas

SQL 2003 coyly defines national character types as "implementation defined". Accordingly, there is considerable variability in how these datatypes behave. Oracle and MySQL use these datatypes to store Unicode strings. That use would not distinguish national from non-national character types in Derby, since Derby stores all strings as Unicode sequences.

The national character datatypes (NCHAR, NVARCHAR, NCLOB and their synonyms) used to exist in Cloudscape but were disabled in Derby. The disabling comment in the grammar says "need to re-enable according to SQL standard". Does this mean that the types were removed because they chafed against SQL 2003? If so, what are their defects?

------------------------------------------------------------------

Cloudscape 3.5 provided the following support for national character types:

- NCHAR and NVARCHAR were legal datatypes.
- Ordering operations on these datatypes were determined by the collating sequence associated with the locale of the database.
- The locale was a DATABASE-wide property which could not be altered.
- Ordering on non-national character datatypes was lexicographic, that is, character by character.
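
To make the difference between the two orderings concrete, here is a minimal Java sketch. It is not Cloudscape or Derby code: the class name OrderingDemo, the sample words, and the choice of a French locale are assumptions made purely for illustration. It sorts the same strings once character by character and once with a java.text.Collator for a given locale.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class OrderingDemo {
    // Character-by-character ordering: String.compareTo compares UTF-16 code units.
    static List<String> lexicographic(List<String> in) {
        List<String> out = new ArrayList<>(in);
        Collections.sort(out);
        return out;
    }

    // Locale-sensitive ordering: the Collator applies the locale's collating sequence.
    static List<String> collated(List<String> in, Locale locale) {
        List<String> out = new ArrayList<>(in);
        out.sort(Collator.getInstance(locale));
        return out;
    }

    public static void main(String[] args) {
        // "\u00e9cole" is "ecole" with an acute accent on the first letter.
        List<String> words = Arrays.asList("zebra", "\u00e9cole", "Zoo", "apple");
        // Upper case sorts before lower case, and the accented letter sorts last.
        System.out.println(lexicographic(words));          // [Zoo, apple, zebra, école]
        // Case and accents are only secondary differences for the collator.
        System.out.println(collated(words, Locale.FRENCH)); // [apple, école, zebra, Zoo]
    }
}
```

In this sketch the locale is passed per call; in the Cloudscape model described above it would instead be fixed once for the whole database.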

------------------------------------------------------------------

Oracle 9i provides the following support for national character types:

- NCHAR, NVARCHAR2, and NCLOB datatypes are used to store Unicode strings.
- Sort order can be overridden per SESSION or even per QUERY, which means that these overridden sort orders are not supported by indexes.

------------------------------------------------------------------

DB2 does not appear to support national character types. Nor does its DRDA data interchange protocol.

------------------------------------------------------------------

MySQL provides the following support for national character types:

- National Char and National Varchar datatypes are used to hold Unicode strings. I cannot find a national CLOB type.
- The character set and sort order can be changed at SERVER-wide, TABLE-wide, or COLUMN-specific levels.

------------------------------------------------------------------

If we removed the disabling logic in Derby, I believe that the following would happen:

- We would get NCHAR, NVARCHAR, and NCLOB datatypes.
- These would sort according to the locale that was bound to the database when it was created.
- We would have to build DRDA transport support for these types.

The difference between national and non-national datatypes would be their sort order.

I am keenly interested in understanding what defects (other than DRDA support) should be addressed in the disabled implementation.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira

[jira] Commented: (DERBY-533) Re-enable national character datatypes

Posted by "Gert Brettlecker (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-533?page=comments#action_12369426 ] 

Gert Brettlecker commented on DERBY-533:
----------------------------------------

In Derby 10.1.2.1, national character datatypes such as "LONG NVARCHAR" are deactivated only in the SQLParser class, where they are replaced with an SQLException. These types are still listed through JDBC if one queries java.sql.DatabaseMetaData.getTypeInfo().
If these national character types are not re-enabled, they should at least not appear when querying the JDBC driver.
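
A rough way to check this from a client is sketched below. The class name NationalTypeCheck and the name-matching heuristic in looksNational are assumptions of this sketch, not part of Derby; only DatabaseMetaData.getTypeInfo() and its TYPE_NAME column come from the JDBC API. The JDBC URL must be supplied by the caller and the driver must be on the classpath.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class NationalTypeCheck {
    // Heuristic (assumption): a type name counts as "national" if it starts with
    // NATIONAL, NCHAR, NVARCHAR or NCLOB, or is exactly LONG NVARCHAR.
    static boolean looksNational(String typeName) {
        String n = typeName.trim().toUpperCase(Locale.ROOT);
        return n.startsWith("NATIONAL ")
            || n.startsWith("NCHAR")
            || n.startsWith("NVARCHAR")
            || n.startsWith("NCLOB")
            || n.equals("LONG NVARCHAR");
    }

    // Collects the national-looking type names that the driver advertises.
    static List<String> advertisedNationalTypes(Connection conn) throws SQLException {
        List<String> found = new ArrayList<>();
        DatabaseMetaData md = conn.getMetaData();
        try (ResultSet rs = md.getTypeInfo()) {
            while (rs.next()) {
                String name = rs.getString("TYPE_NAME");
                if (name != null && looksNational(name)) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length != 1) {
            System.err.println("usage: NationalTypeCheck <jdbc-url>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0])) {
            // An empty list would mean the driver no longer advertises these types.
            System.out.println(advertisedNationalTypes(conn));
        }
    }
}
```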

Re: [jira] Commented: (DERBY-533) Re-enable national character datatypes

Posted by Rick Hillegas <Ri...@Sun.COM>.

Hi Roy,

Thanks for your helpful analysis. We should probably pay closer 
attention to character sets and collations, particularly since MySQL has 
invested so much effort here.

Cheers,
-Rick

Re: [jira] Commented: (DERBY-533) Re-enable national character datatypes

Posted by Roy Lyseng <Ro...@Sun.COM>.

Hi Rick,

I have only studied the SQL 1992 standard concerning character sets, so I 
hope my understanding is still valid (if it ever was).

Both the CHAR and the NCHAR data types are actually the same data type 
CHAR (or CHARACTER), but made up of characters from different character 
sets. Each database has in effect two default character sets, the one 
used for CHAR and the one used for NCHAR. But you may also specify an 
explicit character set for a column as in NAME CHARACTER(100) CHARACTER 
SET UTF8. The character set used for CHAR can also be overridden per schema.

Thus, when you create a database, you should be able to specify that the 
default character set for CHAR columns be ASCII, and the character set 
used for NCHAR be UTF8.

Note also that according to the SQL standard, values of type CHAR but 
with different character sets are not generally comparable.

Each character set will also have a default collation. In a database 
with full SQL support for character sets and collations, you might use 
this to say that both CHAR and NCHAR store UTF16 characters, but that 
CHAR has a binary collation and NCHAR has a French collation.

SQL will also allow you to override a collation specification e.g. on an 
ORDER BY statement, and though not specified by the SQL standard, you 
might be able to create an index using a national ordering sequence.

Cheers,
Roy

[jira] Commented: (DERBY-533) Re-enable national character datatypes

Posted by "Rick Hillegas (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-533?page=comments#action_12319919 ] 

Rick Hillegas commented on DERBY-533:
-------------------------------------

1) There are some interesting issues here. Let's say that we re-enable these datatypes in 10.2. What happens when a client application selects from an NCHAR column under the following combinations? What should the ResultSetMetaData say the column is? Is the following reasonable?


| NETWORK CLIENT | CLIENT PLATFORM | RESULT TYPE |
|----------------|-----------------|-------------|
| Derby 10.2     | jdk1.4          | NCHAR       |
| Derby 10.2     | jdk1.6          | NCHAR       |
| Derby 10.1     | jdk1.4          | CHAR        |
| Derby 10.1     | jdk1.6          | CHAR        |
| db2jcc         | jdk1.4          | CHAR        |
| db2jcc         | jdk1.6          | CHAR        |
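
Read as a rule, the table says that the reported type depends only on the network client and not on the client platform. A tiny Java sketch of that rule follows; the class and method names are hypothetical, and this only restates the proposal above rather than describing any implemented Derby behavior.

```java
public class ProposedResultType {
    // Returns the type name the table proposes for an NCHAR column.
    // The clientPlatform argument is accepted only to show that it does not
    // influence the result in the proposal.
    static String forClient(String networkClient, String clientPlatform) {
        // Only a client that knows about the re-enabled types reports NCHAR;
        // older Derby clients and db2jcc see a plain CHAR.
        return "Derby 10.2".equals(networkClient) ? "NCHAR" : "CHAR";
    }

    public static void main(String[] args) {
        System.out.println(forClient("Derby 10.2", "jdk1.4")); // NCHAR
        System.out.println(forClient("db2jcc", "jdk1.6"));     // CHAR
    }
}
```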

Since all of our string datatypes are represented as Unicode, I think it is OK, as necessary, to implicitly cast CHAR to NCHAR going from client to server.

I also think it is reasonable to raise an exception if someone runs a 10.1 server against a 10.2 database.

2) I don't see where the SQL standard addresses coercion between national strings and other types. Part 2 section 4.2.1 says that NATIONAL CHARACTER is "implementation defined". Part 2 section 6.12 lists legal and forbidden CASTS but says nothing about national string types. As always, I welcome being educated about what else might be relevant in the spec.

Oracle supports the following coercions but not their inverses, and the Oracle documentation does not address localization issues:

   Datetime/Interval -> NCHAR/NVARCHAR2
   Number -> NCHAR/NVARCHAR2

MySQL does not advertise any ability to cast to/from national strings.

DB2 and Postgres do not support national strings.

In summary, I do not see much guidance here. Derby's previous behavior seems reasonable to me: applying localization when coercing between national strings and other types.
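
As an illustration of what "applying localization" can mean when a number is coerced to a string, here is a small Java sketch. It assumes java.text.NumberFormat as the formatting mechanism; whether the earlier implementation used it is not stated here, and the class name LocalizedCoercion is hypothetical.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class LocalizedCoercion {
    // Locale-neutral conversion, as one would expect for a non-national string.
    static String plain(double value) {
        return String.valueOf(value);
    }

    // Locale-sensitive conversion, as a national string type might apply.
    static String localized(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        System.out.println(plain(1234.5));                     // 1234.5
        System.out.println(localized(1234.5, Locale.US));      // 1,234.5
        System.out.println(localized(1234.5, Locale.GERMANY)); // 1.234,5
    }
}
```

The inverse coercion would need the same locale to parse the string back, which is one reason the database-wide locale matters.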


[jira] Closed: (DERBY-533) Re-enable national character datatypes

Posted by "Daniel John Debrunner (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel John Debrunner closed DERBY-533.
---------------------------------------

    Resolution: Won't Fix

From the comments, it seems the main need for these types was the locale-specific ordering, which is being implemented without national character types. Thus there are no plans to fix this, though national character types could be added in the future, more or less as synonyms for the regular types.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (DERBY-533) Re-enable national character datatypes

Posted by "Rick Hillegas (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-533?page=comments#action_12414786 ] 

Rick Hillegas commented on DERBY-533:
-------------------------------------

Hi Satheesh,

I'm looking forward to seeing your spec. I cooled on the idea of re-enabling these datatypes after it seemed to me that their real value was that they enabled locale-specific sort orders. The SQL language for collations seemed to me to be a more flexible (although considerably more expensive) solution.
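
As a rough illustration of what a locale-specific sort order buys over character-by-character ordering, here is a minimal Java sketch. It is not Derby or Cloudscape code: the class and method names (CollationSketch, codeUnitOrder, localeOrder) are made up for this example, and it only assumes that java.text.Collator is a reasonable stand-in for a collating sequence bound to a database locale.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of Derby.
class CollationSketch {

    // Character-by-character ordering: String.compareTo compares UTF-16 code units.
    static List<String> codeUnitOrder(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        Collections.sort(sorted);
        return sorted;
    }

    // Locale-specific ordering: a Collator for the given locale decides the order.
    static List<String> localeOrder(List<String> words, Locale locale) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(Collator.getInstance(locale));
        return sorted;
    }

    public static void main(String[] args) {
        // "\u00e9tude" is "etude" with an acute accent on the first letter.
        List<String> words = Arrays.asList("zebra", "\u00e9tude", "eclair", "apple");
        System.out.println(codeUnitOrder(words));
        System.out.println(localeOrder(words, Locale.FRENCH));
    }
}
```

With code-unit ordering the accented word lands after "zebra", because its first character has a higher code unit value; with the French collator it sorts among the other words beginning with "e".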

I don't understand the implications of checking in these datatypes without building the corresponding JDBC4 support. I hadn't planned to build that support for 10.2. It may mean that we can't claim we're JDBC4 compliant.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Reopened: (DERBY-533) Re-enable national character datatypes

Posted by "Daniel John Debrunner (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel John Debrunner reopened DERBY-533:
-----------------------------------------


Reopening the issue to change the resolution, since it was not fixed.

[jira] Commented: (DERBY-533) Re-enable national character datatypes

Posted by "Satheesh Bandaram (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-533?page=comments#action_12414625 ] 

Satheesh Bandaram commented on DERBY-533:
-----------------------------------------

I have some interest in enabling national character types in Derby for 10.2. I am following all the previous discussions on this subject. My itch is primarily on the SQL side, and I haven't decided whether to provide JDBC3 or JDBC4 API support. JDBC4 has added support for national character types: the ability to set and get data of these types, and JDBC type constants for them.

I don't have an itch to implement JDBC4 support, so I am currently thinking of enabling the national types and adding JDBC3 API support for them. I am still at a very early stage and would like to submit a functional specification soon.
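
For reference, the JDBC 4.0 type constants mentioned above can be listed with a small Java sketch. This only reads constants from java.sql.Types (available since Java 6); the class name NationalTypeConstants and the helper nationalTypes are hypothetical, and nothing here touches Derby. JDBC 4.0 also added accessor methods such as PreparedStatement.setNString and ResultSet.getNString, which are not exercised in this sketch.

```java
import java.sql.Types;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; lists the national character type
// constants that JDBC 4.0 added to java.sql.Types.
class NationalTypeConstants {

    static Map<String, Integer> nationalTypes() {
        Map<String, Integer> types = new LinkedHashMap<>();
        types.put("NCHAR", Types.NCHAR);
        types.put("NVARCHAR", Types.NVARCHAR);
        types.put("LONGNVARCHAR", Types.LONGNVARCHAR);
        types.put("NCLOB", Types.NCLOB);
        return types;
    }

    public static void main(String[] args) {
        // Print each constant name with its integer type code.
        nationalTypes().forEach((name, code) -> System.out.println(name + " = " + code));
    }
}
```

A JDBC3-only implementation would have no such constants to report, so it would presumably have to describe these columns with the existing CHAR, VARCHAR and CLOB type codes or with vendor-specific ones.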



[jira] Closed: (DERBY-533) Re-enable national character datatypes

Posted by "Rick Hillegas (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Hillegas closed DERBY-533.
-------------------------------

    Resolution: Fixed

Closing this issue. I think that DERBY-1478 is a better solution to this problem.

[jira] Commented: (DERBY-533) Re-enable national character datatypes

Posted by "Daniel John Debrunner (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-533?page=comments#action_12319893 ] 

Daniel John Debrunner commented on DERBY-533:
---------------------------------------------

I can think of two issues.

1) JDBC constants (java.sql.Types) do not exist for these types until JDBC 4.0.

2) In the existing code, conversions of national character types to and from datetime and number types applied localization.
I don't know if this approach is correct with respect to the SQL standard, or in line with other databases. Ensuring the conversions
are correct before allowing applications to use them was the reason for disabling the types. We didn't want to get into backwards
compatibility issues by later changing the behaviour to match the standard. I don't think anyone had time to resolve this.
For example, converting a DECIMAL of 1.2 to VARCHAR yields 1.2, but in a French database the conversion
to NATIONAL VARCHAR would yield 1,2. Is this correct? Similarly for conversions from character types: should 1,2 be converted
to 1.2 in French from a NATIONAL VARCHAR? This is from memory; I know this is true for the datetime types, but maybe it isn't
true for the numeric types (but they are easier to explain and come up with simple examples :-).
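
The 1.2 versus 1,2 example can be reproduced outside the database with a minimal Java sketch. This is not the disabled Derby code: the class and method names (LocaleConversionSketch, plain, localized, parseLocalized) are hypothetical, and using java.text.NumberFormat as the localizing step is an assumption made purely for illustration.

```java
import java.math.BigDecimal;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical illustration only; not the conversion code from Derby.
class LocaleConversionSketch {

    // Locale-independent conversion, as for DECIMAL to VARCHAR.
    static String plain(BigDecimal value) {
        return value.toPlainString();
    }

    // Locale-sensitive conversion, as described for NATIONAL VARCHAR.
    static String localized(BigDecimal value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Reverse direction: read a localized string back into a number.
    static double parseLocalized(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a number in " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        BigDecimal value = new BigDecimal("1.2");
        System.out.println(plain(value));                        // decimal point
        System.out.println(localized(value, Locale.FRENCH));     // decimal comma
        System.out.println(parseLocalized("1,2", Locale.FRENCH));
    }
}
```

The sketch shows why the question matters for compatibility: the same DECIMAL value produces two different strings depending on whether the target type applies the database locale.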
          

[jira] Commented: (DERBY-533) Re-enable national character datatypes

Posted by "Satheesh Bandaram (JIRA)" <de...@db.apache.org>.

    [ http://issues.apache.org/jira/browse/DERBY-533?page=comments#action_12420746 ] 

Satheesh Bandaram commented on DERBY-533:
-----------------------------------------

I earlier had some interest in enabling national characters for 10.2, but after the dates for 10.2 became clearer, I decided it wouldn't be feasible to research and implement this functionality in the remaining time. I will add a comment to the JIRA entry stating this. I also don't have an itch to look into this issue for 10.3, though I think this is very useful functionality.
