You are viewing a plain text version of this content. The canonical link for it is here.
Posted to ojb-user@db.apache.org by Guillaume Laforge <gl...@gmail.com> on 2005/07/28 16:01:07 UTC

Stroring internationalized strings in Oracle9i with OJB

Hello,

On my current project, I'm using Oracle9i and OJB. The version of OJB
is rather old: that's 1.0.0, but I may and will certainly upgrade to
1.0.3 if that's needed.

I'm storing sentences in 40 different languages (English, French,
Japanese, Chinese, etc...).

All the columns of my tables which are supposed to be holding some
i18n content are of type NVARCHAR2 or NCHAR -- i.e. I'm using
Oracle9i's specific types for storing Unicode characters.

My problem is that when I try to store and then retrieve labels with
non-latin characters, I don't get the same string. In fact, the label
is badly stored in the database, and badly encoded.

Hence the question: How can I manage to configure OJB, or perhaps my
datasource, or my mapping, to make OJB store java Unicode strings
correctly?

Brian told me there may be some specific means for altering all
statements for a give data type. Is there some documentation somewhere
I could read? Oracle allows to use a specific pstmt.setFormOfUse(1,
OraclePreparedStatement.FORM_NCHAR) which is an extension to JDBC
unfortunately.

To give some more background, here are some more specific details on
my Oracle9i configuration:

SQL> select * from nls_database_parameters;

PARAMETER                   		VALUE
------------------------------ 	--------------------------
NLS_LANGUAGE		FRENCH
NLS_TERRITORY		FRANCE
NLS_CURRENCY		? (ca doit être le caractère euro, mais le DOS ne le
supporte pas à l'affichage)
NLS_ISO_CURRENCY		FRANCE
NLS_NUMERIC_CHARACTERS         ,.
NLS_CHARACTERSET	WE8ISO8859P15
NLS_CALENDAR		GREGORIAN
NLS_DATE_FORMAT		DD/MM/YYYY
NLS_DATE_LANGUAGE	FRENCH
NLS_SORT			FRENCH
NLS_TIME_FORMAT		HH24:MI:SSXFF

PARAMETER                      	VALUE
------------------------------ 	--------------------------
NLS_TIMESTAMP_FORMAT		DD/MM/RR HH24:MI:SSXFF
NLS_TIME_TZ_FORMAT		HH24:MI:SSXFF TZR
NLS_TIMESTAMP_TZ_FORMAT	DD/MM/RR HH24:MI:SSXFF TZR
NLS_DUAL_CURRENCY		? (encore le caractère euro)
NLS_COMP				BINARY
NLS_LENGTH_SEMANTICS		BYTE
NLS_NCHAR_CONV_EXCP 		FALSE
NLS_NCHAR_CHARACTERSET	AL16UTF16
NLS_RDBMS_VERSION		9.2.0.4.0

As you can see, my default encoding is the same as the system it's
installed on, and it's by default WE8ISO8859P15. And with that default
charset, Oracle treats all String inserts as if the String was a local
String using this encoding, though the NCHAR type has a 16-bit Unicode
charset.

Oracle9i provides a way to specifically tell the JDBC driver to encode
a String as a Unicode string for the Unicode data types (NCLOB, NCHAR,
NVARCHAR2), and you can do something like:

PreparedStatement statementInsert = cnx.prepareStatement(query);
((OraclePreparedStatement)statementInsert).setFormOfUse(1,
OraclePreparedStatement.FORM_NCHAR);

This solution works, and I can store and retrieve Unicode content.
But obviously, that's not very clean... especially because it uses a
specific Oracle method on its specific implementation of
PreparedStatement.
But well, if that works in JDBC, perhaps there could be a way to do
that with OJB? Some configuration / customisation?

Has anyone encountered that problem and workarounded that it?

Thanks in advance for all the tips or potential solutions.

-- 
Guillaume Laforge
http://glaforge.free.fr/weblog/?catid=2

---------------------------------------------------------------------
To unsubscribe, e-mail: ojb-user-unsubscribe@db.apache.org
For additional commands, e-mail: ojb-user-help@db.apache.org


Re: Stroring internationalized strings in Oracle9i with OJB

Posted by Guillaume Laforge <gl...@gmail.com>.
Hi,

On 28/07/05, Bobby Lawrence <ro...@jlab.org> wrote:
> I don't think OJB has any connection-specific configuration, but you can
> try this code on your insert:
> 
> ClassDescriptor classDescriptor = broker.getClassDescriptor(YourClass.class);
> PreparedStatement statementInsert = broker.serviceStatementManager().getInsertStatement(classDescriptor);
> ((OraclePreparedStatement)statementInsert).setFormOfUse(1, OraclePreparedStatement.FORM_NCHAR);
> broker.serviceStatementManager().bindInsert(statementInsert , classDescriptor, yourObject);
> broker.store(yourObject);
> 
> Obviously, you can change it slightly for an update.

Thanks a lot for your tip! 
It looks like OJB is pretty neatly configurable for those nasty little
issues ;-)
I haven't tested that tip though, but I'm pretty sure it works great.

Instead, I've just asked my DBA to change my database's configuration
to use UTF-8 as its default charset, so that the Oracle thin driver is
able to talk the same speak with the database, without any transcoding
issue.
And now, my code works perfectly well without a change or a particular
Database specific snippet of code.

In conclusion, for those on the list who have a problem with Oracle9i
(+OJB) for storing and retrieving NCHAR/NVARCHAR2/NCLOB columns
containing localized strings, make sure the "NLS_CHARACTERSET"
variable is set to "UTF8" (not an 8-bit charset) and you should be
safe.

-- 
Guillaume Laforge
http://glaforge.free.fr/weblog/?catid=2

---------------------------------------------------------------------
To unsubscribe, e-mail: ojb-user-unsubscribe@db.apache.org
For additional commands, e-mail: ojb-user-help@db.apache.org


Re: Stroring internationalized strings in Oracle9i with OJB

Posted by Guillaume Laforge <gl...@gmail.com>.
On 29/07/05, Thomas Dudziak <to...@gmail.com> wrote:
> Ok, in that case I think it is better if I simply create a FAQ entry.

Right, a FAQ entry is probably more appropriate.
Thanks everybody for your help.


-- 
Guillaume Laforge
http://glaforge.free.fr/weblog/?catid=2

---------------------------------------------------------------------
To unsubscribe, e-mail: ojb-user-unsubscribe@db.apache.org
For additional commands, e-mail: ojb-user-help@db.apache.org


Re: Stroring internationalized strings in Oracle9i with OJB

Posted by Thomas Dudziak <to...@gmail.com>.
On 7/29/05, Guillaume Laforge <gl...@gmail.com> wrote:
> Hi Tom!
> 
> Well, yes and no. Should I create an issue or not?
> Since the problem can be considered a database configuration issue?
> Simply setting the NLS_CHARACTERSET to UTF8 makes the JDBC driver and
> the Oracle instance to talk together flawlessly. So should the Oracle
> platform support use the OraclePreparedStatement.setForUse() trick?
> If yes, yeah, I can create an issue for that.
> The setForUse() trick (+Bobby's solution) works if your default
> charset is an 8-bit charset, but if you're using UTF8, it's not
> needed.

Ok, in that case I think it is better if I simply create a FAQ entry.

regards,
Tom

---------------------------------------------------------------------
To unsubscribe, e-mail: ojb-user-unsubscribe@db.apache.org
For additional commands, e-mail: ojb-user-help@db.apache.org


Re: Stroring internationalized strings in Oracle9i with OJB

Posted by Guillaume Laforge <gl...@gmail.com>.
Hi Tom!

On 28/07/05, Thomas Dudziak <to...@gmail.com> wrote:
> Bobby is right, Oracle does not seem to be able to handle NCHAR
> directly in the JDBC driver; you are required to use the
> Oracle-specific extension.
> This might be a useful enhancement in the Oracle platform ...
> Guillaume, could you raise a JIRA issue for this ?

Well, yes and no. Should I create an issue or not?
Since the problem can be considered a database configuration issue?
Simply setting the NLS_CHARACTERSET to UTF8 makes the JDBC driver and
the Oracle instance to talk together flawlessly. So should the Oracle
platform support use the OraclePreparedStatement.setForUse() trick?
If yes, yeah, I can create an issue for that.
The setForUse() trick (+Bobby's solution) works if your default
charset is an 8-bit charset, but if you're using UTF8, it's not
needed.

-- 
Guillaume Laforge
http://glaforge.free.fr/weblog/?catid=2

---------------------------------------------------------------------
To unsubscribe, e-mail: ojb-user-unsubscribe@db.apache.org
For additional commands, e-mail: ojb-user-help@db.apache.org


Re: Stroring internationalized strings in Oracle9i with OJB

Posted by Thomas Dudziak <to...@gmail.com>.
Bobby is right, Oracle does not seem to be able to handle NCHAR
directly in the JDBC driver; you are required to use the
Oracle-specific extension.
This might be a useful enhancement in the Oracle platform ...
Guillaume, could you raise a JIRA issue for this ?

Tom

---------------------------------------------------------------------
To unsubscribe, e-mail: ojb-user-unsubscribe@db.apache.org
For additional commands, e-mail: ojb-user-help@db.apache.org


Re: Stroring internationalized strings in Oracle9i with OJB

Posted by Bobby Lawrence <ro...@jlab.org>.
Guillaume -
I don't think OJB has any connection-specific configuration, but you can 
try this code on your insert:

ClassDescriptor classDescriptor = broker.getClassDescriptor(YourClass.class);
PreparedStatement statementInsert = broker.serviceStatementManager().getInsertStatement(classDescriptor);
((OraclePreparedStatement)statementInsert).setFormOfUse(1, OraclePreparedStatement.FORM_NCHAR);
broker.serviceStatementManager().bindInsert(statementInsert , classDescriptor, yourObject);
broker.store(yourObject);

Obviously, you can change it slightly for an update.
--Bobby


Guillaume Laforge wrote:

>I forgot to mention that I'm using Oracle9i's thin driver and its ojdbc14.jar.
>With a version 9.2.0.4 of Oracle9i.
>
>
>On 28/07/05, Guillaume Laforge <gl...@gmail.com> wrote:
>  
>
>>Hello,
>>
>>On my current project, I'm using Oracle9i and OJB. The version of OJB
>>is rather old: that's 1.0.0, but I may and will certainly upgrade to
>>1.0.3 if that's needed.
>>
>>I'm storing sentences in 40 different languages (English, French,
>>Japanese, Chinese, etc...).
>>
>>All the columns of my tables which are supposed to be holding some
>>i18n content are of type NVARCHAR2 or NCHAR -- i.e. I'm using
>>Oracle9i's specific types for storing Unicode characters.
>>
>>My problem is that when I try to store and then retrieve labels with
>>non-latin characters, I don't get the same string. In fact, the label
>>is badly stored in the database, and badly encoded.
>>
>>Hence the question: How can I manage to configure OJB, or perhaps my
>>datasource, or my mapping, to make OJB store java Unicode strings
>>correctly?
>>
>>Brian told me there may be some specific means for altering all
>>statements for a give data type. Is there some documentation somewhere
>>I could read? Oracle allows to use a specific pstmt.setFormOfUse(1,
>>OraclePreparedStatement.FORM_NCHAR) which is an extension to JDBC
>>unfortunately.
>>
>>To give some more background, here are some more specific details on
>>my Oracle9i configuration:
>>
>>SQL> select * from nls_database_parameters;
>>
>>PARAMETER                               VALUE
>>------------------------------  --------------------------
>>NLS_LANGUAGE            FRENCH
>>NLS_TERRITORY           FRANCE
>>NLS_CURRENCY            ? (ca doit être le caractère euro, mais le DOS ne le
>>supporte pas à l'affichage)
>>NLS_ISO_CURRENCY                FRANCE
>>NLS_NUMERIC_CHARACTERS         ,.
>>NLS_CHARACTERSET        WE8ISO8859P15
>>NLS_CALENDAR            GREGORIAN
>>NLS_DATE_FORMAT         DD/MM/YYYY
>>NLS_DATE_LANGUAGE       FRENCH
>>NLS_SORT                        FRENCH
>>NLS_TIME_FORMAT         HH24:MI:SSXFF
>>
>>PARAMETER                       VALUE
>>------------------------------  --------------------------
>>NLS_TIMESTAMP_FORMAT            DD/MM/RR HH24:MI:SSXFF
>>NLS_TIME_TZ_FORMAT              HH24:MI:SSXFF TZR
>>NLS_TIMESTAMP_TZ_FORMAT DD/MM/RR HH24:MI:SSXFF TZR
>>NLS_DUAL_CURRENCY               ? (encore le caractère euro)
>>NLS_COMP                                BINARY
>>NLS_LENGTH_SEMANTICS            BYTE
>>NLS_NCHAR_CONV_EXCP             FALSE
>>NLS_NCHAR_CHARACTERSET  AL16UTF16
>>NLS_RDBMS_VERSION               9.2.0.4.0
>>
>>As you can see, my default encoding is the same as the system it's
>>installed on, and it's by default WE8ISO8859P15. And with that default
>>charset, Oracle treats all String inserts as if the String was a local
>>String using this encoding, though the NCHAR type has a 16-bit Unicode
>>charset.
>>
>>Oracle9i provides a way to specifically tell the JDBC driver to encode
>>a String as a Unicode string for the Unicode data types (NCLOB, NCHAR,
>>NVARCHAR2), and you can do something like:
>>
>>PreparedStatement statementInsert = cnx.prepareStatement(query);
>>((OraclePreparedStatement)statementInsert).setFormOfUse(1,
>>OraclePreparedStatement.FORM_NCHAR);
>>
>>This solution works, and I can store and retrieve Unicode content.
>>But obviously, that's not very clean... especially because it uses a
>>specific Oracle method on its specific implementation of
>>PreparedStatement.
>>But well, if that works in JDBC, perhaps there could be a way to do
>>that with OJB? Some configuration / customisation?
>>
>>Has anyone encountered that problem and workarounded that it?
>>
>>Thanks in advance for all the tips or potential solutions.
>>    
>>
>
>
>
>  
>

-- 
----------------------------
Bobby Lawrence
MIS Application Developer

Jefferson Lab (www.jlab.org)

 Email: robertl@jlab.org
Office: (757) 269-5818
 Pager: (757) 584-5818
----------------------------



Re: Stroring internationalized strings in Oracle9i with OJB

Posted by Guillaume Laforge <gl...@gmail.com>.
I forgot to mention that I'm using Oracle9i's thin driver and its ojdbc14.jar.
With a version 9.2.0.4 of Oracle9i.


On 28/07/05, Guillaume Laforge <gl...@gmail.com> wrote:
> Hello,
> 
> On my current project, I'm using Oracle9i and OJB. The version of OJB
> is rather old: that's 1.0.0, but I may and will certainly upgrade to
> 1.0.3 if that's needed.
> 
> I'm storing sentences in 40 different languages (English, French,
> Japanese, Chinese, etc...).
> 
> All the columns of my tables which are supposed to be holding some
> i18n content are of type NVARCHAR2 or NCHAR -- i.e. I'm using
> Oracle9i's specific types for storing Unicode characters.
> 
> My problem is that when I try to store and then retrieve labels with
> non-latin characters, I don't get the same string. In fact, the label
> is badly stored in the database, and badly encoded.
> 
> Hence the question: How can I manage to configure OJB, or perhaps my
> datasource, or my mapping, to make OJB store java Unicode strings
> correctly?
> 
> Brian told me there may be some specific means for altering all
> statements for a give data type. Is there some documentation somewhere
> I could read? Oracle allows to use a specific pstmt.setFormOfUse(1,
> OraclePreparedStatement.FORM_NCHAR) which is an extension to JDBC
> unfortunately.
> 
> To give some more background, here are some more specific details on
> my Oracle9i configuration:
> 
> SQL> select * from nls_database_parameters;
> 
> PARAMETER                               VALUE
> ------------------------------  --------------------------
> NLS_LANGUAGE            FRENCH
> NLS_TERRITORY           FRANCE
> NLS_CURRENCY            ? (ca doit être le caractère euro, mais le DOS ne le
> supporte pas à l'affichage)
> NLS_ISO_CURRENCY                FRANCE
> NLS_NUMERIC_CHARACTERS         ,.
> NLS_CHARACTERSET        WE8ISO8859P15
> NLS_CALENDAR            GREGORIAN
> NLS_DATE_FORMAT         DD/MM/YYYY
> NLS_DATE_LANGUAGE       FRENCH
> NLS_SORT                        FRENCH
> NLS_TIME_FORMAT         HH24:MI:SSXFF
> 
> PARAMETER                       VALUE
> ------------------------------  --------------------------
> NLS_TIMESTAMP_FORMAT            DD/MM/RR HH24:MI:SSXFF
> NLS_TIME_TZ_FORMAT              HH24:MI:SSXFF TZR
> NLS_TIMESTAMP_TZ_FORMAT DD/MM/RR HH24:MI:SSXFF TZR
> NLS_DUAL_CURRENCY               ? (encore le caractère euro)
> NLS_COMP                                BINARY
> NLS_LENGTH_SEMANTICS            BYTE
> NLS_NCHAR_CONV_EXCP             FALSE
> NLS_NCHAR_CHARACTERSET  AL16UTF16
> NLS_RDBMS_VERSION               9.2.0.4.0
> 
> As you can see, my default encoding is the same as the system it's
> installed on, and it's by default WE8ISO8859P15. And with that default
> charset, Oracle treats all String inserts as if the String was a local
> String using this encoding, though the NCHAR type has a 16-bit Unicode
> charset.
> 
> Oracle9i provides a way to specifically tell the JDBC driver to encode
> a String as a Unicode string for the Unicode data types (NCLOB, NCHAR,
> NVARCHAR2), and you can do something like:
> 
> PreparedStatement statementInsert = cnx.prepareStatement(query);
> ((OraclePreparedStatement)statementInsert).setFormOfUse(1,
> OraclePreparedStatement.FORM_NCHAR);
> 
> This solution works, and I can store and retrieve Unicode content.
> But obviously, that's not very clean... especially because it uses a
> specific Oracle method on its specific implementation of
> PreparedStatement.
> But well, if that works in JDBC, perhaps there could be a way to do
> that with OJB? Some configuration / customisation?
> 
> Has anyone encountered that problem and workarounded that it?
> 
> Thanks in advance for all the tips or potential solutions.



-- 
Guillaume Laforge
http://glaforge.free.fr/weblog/?catid=2

---------------------------------------------------------------------
To unsubscribe, e-mail: ojb-user-unsubscribe@db.apache.org
For additional commands, e-mail: ojb-user-help@db.apache.org