Posted to users@cocoon.apache.org by Udi Weinsberg <ud...@tochna.technion.ac.il> on 2001/09/27 15:20:01 UTC

Language Support in Request Parameters

C2, WNT (hebrew enabled), Tomcat3.2.3, MySQL, IE5.5

Hey!

I'm trying to write an application that uses Hebrew in forms (meaning that
the user can type Hebrew chars into form elements, mainly input boxes).
I guess that the problem is the same for any language whose chars get
URL-encoded (%XX) when a form is submitted.

I ran a simple application in Tomcat (as a plain servlet) and in Cocoon 2.
It simply takes the data you entered in an input box (in Hebrew), places
it into a request parameter and then displays the request parameter on a
different page.

In the POST message, I saw that Explorer is encoding my chars correctly:
POST .....
host: ...

UserName: %E0%D9%E3....

When I ran the application as a Tomcat servlet, the results were good: I
saw the exact (Hebrew) chars that I had typed.

In C2, however, the parameter did not show up correctly, and was encoded
differently.

The problem gets worse when I try to insert the data into a MySQL db (which
expects the normal %XX encoding): I get garbage there as well.


Has anyone used C2 with an html-encoded language? Can you tell me what I need
to do to make it work?
Why does it work in Tomcat and not in C2? Where is the translation being
performed?

Thanks,
Udi.



---------------------------------------------------------------------
Please check that your question has not already been answered in the
FAQ before posting. <http://xml.apache.org/cocoon/faqs.html>

To unsubscribe, e-mail: <co...@xml.apache.org>
For additional commands, e-mail: <co...@xml.apache.org>


Re: Language Support in Request Parameters

Posted by Udi Weinsberg <ud...@tochna.technion.ac.il>.

On Fri, 28 Sep 2001, Piroumian, Konstantin wrote:

> > <skipped>
> >
> > This basically means that if it finds a %CC encoded char, it simply
> > translates the CC into its CHAR equivalent, and appends it to the
> > resulting string. Isn't this right?? It seems perfectly right, since the
> > DB works with the exact same hex values, and the only way to pass chars is
> > using their BYTE value (0-255). I really don't understand what I am
> > missing here.
> >
> > I am using the DBAddAction to add data to the database. My pipeline is
> > quite simple and looks like:
> > AddPatient.xsp (has form) -> DBAddAction (using the request arguments) ->
> > ShowPatient.xsp (show the details by querying from the DB).
>
> What I could get from your email is that you can get a correct parameter,
> store it in a database and then show it. So, where's the problem? Maybe you
> should try to URLEncode/URLDecode your params before sending/after
> receiving? It's Friday evening and quite difficult to understand. ;)

I wish! I can't get the correct parameter! It inserts garbage into the DB
and shows the same garbage later on. I would have tried to encode/decode,
but I am using the DBAddAction, which is supplied with C2. I guess I'll have
to start messing with the code.

>
> >
> > The problem is not in the DB or JDBC driver since I am able to retrieve
> > data from the db and insert it to a param argument. Perhaps I'll try to
> > see how this variable is set (as a simple string, in which base?).
>
> Where do you set a variable? In XSP to a Java variable or maybe some other
> way (sitemap, action, stylesheet)? Maybe the problem is in your own code?
>

The variable is set for me. Once C2 gets the request, it builds the
HttpEnvironment object, with the parameters parsed into the HttpRequest
object. This is then passed to the DBAddAction (not my code) and off to the
database (only it does not work!). I doubt that the problem is in my code,
since I did not write any code ;-)

> >
> > An example (regardless of your i18n) would be great. Btw, did you document
> > the 'iw <-> he' problem I sent you?
>
> Sorry, not yet, but I remember about it.
> Unfortunately, the project I am working on has very little relation to
> Cocoon now, and I don't have much time to support i18n. I have several
> things to do: add your note, correct the Polish translation, improve the
> samples, etc., but I can't predict when it'll happen.
>

Indeed - time is a problem. Friday night - let's call it a day! :-)

> >
> > If you are listed in the cocoon-dev list, perhaps you can post the problem
> > there. Let me know!
> >

Forget this - I already posted the problem in the dev group. Let's see
what the experts have to say about this :-)


Udi.





Re: Language Support in Request Parameters

Posted by "Piroumian, Konstantin" <KP...@flagship.ru>.
> <skipped>
>
> This basically means that if it finds a %CC encoded char, it simply
> translates the CC into its CHAR equivalent, and appends it to the
> resulting string. Isn't this right?? It seems perfectly right, since the
> DB works with the exact same hex values, and the only way to pass chars is
> using their BYTE value (0-255). I really don't understand what I am
> missing here.
>
> I am using the DBAddAction to add data to the database. My pipeline is
> quite simple and looks like:
> AddPatient.xsp (has form) -> DBAddAction (using the request arguments) ->
> ShowPatient.xsp (show the details by querying from the DB).

What I could get from your email is that you can get a correct parameter,
store it in a database and then show it. So, where's the problem? Maybe you
should try to URLEncode/URLDecode your params before sending/after
receiving? It's Friday evening and quite difficult to understand. ;)

>
> The problem is not in the DB or JDBC driver since I am able to retrieve
> data from the db and insert it to a param argument. Perhaps I'll try to
> see how this variable is set (as a simple string, in which base?).

Where do you set a variable? In XSP to a Java variable or maybe some other
way (sitemap, action, stylesheet)? Maybe the problem is in your own code?

>
> An example (regardless of your i18n) would be great. Btw, did you document
> the 'iw <-> he' problem I sent you?

Sorry, not yet, but I remember about it.
Unfortunately, the project I am working on has very little relation to
Cocoon now, and I don't have much time to support i18n. I have several
things to do: add your note, correct the Polish translation, improve the
samples, etc., but I can't predict when it'll happen.

>
> If you are listed in the cocoon-dev list, perhaps you can post the problem
> there. Let me know!
>
>
> Thanks,
> Udi.
>
>




Re: Language Support in Request Parameters

Posted by Udi Weinsberg <ud...@tochna.technion.ac.il>.
Digging a bit into the code, I found the following piece (which is copied
from Tomcat, and altered a bit). It's from
src\org\apache\cocoon\environment\wrapper\RequestParameters.java

/**
     * Decode the string
     */
    private String parseName(String s) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '+':
                    sb.append(' ');
                    break;
                case '%':
                    try {
                        sb.append((char) Integer.parseInt(s.substring(i+1, i+3), 16));
                        i += 2;
                    } catch (NumberFormatException e) {
                        throw new IllegalArgumentException();
                    } catch (StringIndexOutOfBoundsException e) {
                        String rest  = s.substring(i);
                        sb.append(rest);
                        if (rest.length()==2)
                            i++;
                    }

                    break;
                default:
                    sb.append(c);
                    break;
            }
        }
        return sb.toString();
    }


This basically means that if it finds a %CC encoded char, it simply
translates the CC into its CHAR equivalent, and appends it to the
resulting string. Isn't this right?? It seems perfectly right, since the
DB works with the exact same hex values, and the only way to pass chars is
using their BYTE value (0-255). I really don't understand what I am
missing here.
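
One possible explanation, sketched below and not taken from the Cocoon source: casting each %XX byte straight to a char gives the same result as decoding the bytes as ISO-8859-1, so a Hebrew letter sent in another charset turns into a different char (or several). The class and method names are hypothetical, and the sample inputs (%D7%90 for a UTF-8 form, %E0 for a windows-1255 form) are assumptions for illustration only. If the submitting charset is known, the original bytes can be recovered and decoded again:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; none of these names come from Cocoon.
class ByteToCharSketch {

    // Mimics the (char) cast in parseName: each %XX byte becomes the char
    // with the same numeric value, which is what ISO-8859-1 decoding does.
    // Assumes well-formed input (every '%' is followed by two hex digits).
    static String castDecode(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%') {
                sb.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else if (c == '+') {
                sb.append(' ');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Recovers the original bytes from the cast-decoded string and decodes
    // them again with the charset the form is assumed to have been sent in.
    static String redecode(String castDecoded, Charset formCharset) {
        byte[] raw = castDecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, formCharset);
    }

    public static void main(String[] args) {
        // %D7%90 is the letter alef (U+05D0) when the form is sent as UTF-8.
        String cast = castDecode("%D7%90");
        System.out.println("chars after cast: " + cast.length());
        System.out.println("recovered alef: "
                + redecode(cast, StandardCharsets.UTF_8).equals("\u05D0"));

        // %E0 is alef in windows-1255, if this JVM ships that charset.
        if (Charset.isSupported("windows-1255")) {
            String heb = redecode(castDecode("%E0"), Charset.forName("windows-1255"));
            System.out.println("recovered alef (windows-1255): " + heb.equals("\u05D0"));
        }
    }
}
```

Whether this is what actually happens between the browser, Tomcat and C2 here would still need to be checked against the bytes in the log file.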

I am using the DBAddAction to add data to the database. My pipeline is
quite simple and looks like:
AddPatient.xsp (has form) -> DBAddAction (using the request arguments) ->
ShowPatient.xsp (show the details by querying from the DB).

The problem is not in the DB or JDBC driver, since I am able to retrieve
data from the db and insert it into a param argument. Perhaps I'll try to
see how this variable is set (as a simple string, in which base?).

An example (regardless of your i18n) would be great. Btw, did you document
the 'iw <-> he' problem I sent you?

If you are listed in the cocoon-dev list, perhaps you can post the problem
there. Let me know!


Thanks,
Udi.






Re: Language Support in Request Parameters

Posted by "Piroumian, Konstantin" <KP...@flagship.ru>.
 Hi!
>
> You said you had the same problem in Oracle. I'm using MySQL with the MM
> JDBC driver, which expects the normal chars encoding (%xx). Even if I use
> the serializer encoding (which I am about to try now), it will not do the
> trick, it will only try to overcome an inherent problem (if it will work
> at all):
>
> The serializer is the output's last pipe's stage, and it seems that the
> problem is somewhere in the input pipe (meaning from the user to the
> server and within the server). I can even see that the Log file has
> incorrect data.
>
> An interesting point is that when I retrieve data from the database into a
> session argument (using the DBAuthAction), the data is inserted correctly
> - meaning that I see it proper in the Log file and in the resulting page
> (both in plain hebrew). As I looked into the code I saw that the Action
> simply queries the db, and puts the result in the session param (am I
> right, or is there any encoding modification here?). Thus, the results are
> good, and I see my text as intended.

So, the problem is not in the DB and JDBC driver, is it?

>
> However, if the data is provided from the user, then the data gets
> corrupted (reencoded?? where in c2?) - It is the same corrupted data for
> the database, the resulting html and the log file. So, I guess that the
> problem is in the translation of the data. Since Tomcat does not translate
> the data, then (as you said) the problem is with the C2 translation.

Do you use actions to process the user input? Did you try to use
java.net.URLDecoder.decode() before using params?
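
A rough sketch of the URLDecoder idea, under some assumptions: it uses the two-argument java.net.URLDecoder.decode(String, String), which takes the charset name explicitly and was only added in JDK 1.4, so it may not exist on the JDK used in this thread. It is also only useful on a string that is still %XX-encoded (for example the raw query string), not on a parameter the container has already decoded. The class name, method name and sample value are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical illustration class; not part of Cocoon or Tomcat.
class UrlDecodeSketch {

    // Decodes a still-encoded form value using the given charset name.
    // The checked exception is wrapped so callers need no throws clause.
    static String decodeParam(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Sample value (an assumption): alef as sent by a UTF-8 encoded form.
        String encoded = "%D7%90";

        // Right charset: the two bytes become one Hebrew letter.
        System.out.println(decodeParam(encoded, "UTF-8").length());

        // Wrong charset: each byte becomes its own unrelated char.
        System.out.println(decodeParam(encoded, "ISO-8859-1").length());
    }
}
```

The charset passed in has to match the one the browser used for the form, which is itself an assumption that needs to be confirmed for this setup.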

>
> 1. Where does this translation take place? (which file in the source)

That depends on your pipeline. Maybe there is no translation at all. What
does your pipeline look like?

> 1.1 Why do we have this translation?

As far as I remember, this happens because some servers handle HTTP requests
as 8-bit encoded data and do not correctly interpret Unicode streams. So,
browsers URL-encode all characters above ASCII code 127 into %CC form. The
same thing happens when the web server sends the response. Something like
that, but I'm not sure that all of this is correct. See the Tomcat
documentation and the Servlet specification for more info.
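
To make the %CC form concrete, here is a small sketch (an illustration only, with a hypothetical class name) using java.net.URLEncoder. It shows that the %XX sequence for one and the same character depends on the charset used to turn it into bytes, which is why the receiving side has to know that charset. The two-argument encode(String, String) was added in JDK 1.4, and windows-1255 is only tried if the JVM supports it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;

// Hypothetical illustration class; not part of Cocoon or Tomcat.
class UrlEncodeSketch {

    // URL-encodes a value with the given charset name, wrapping the
    // checked exception so callers need no throws clause.
    static String encodeParam(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String alef = "\u05D0"; // Hebrew letter alef

        // As UTF-8 the letter is two bytes, so two %XX escapes.
        System.out.println(encodeParam(alef, "UTF-8"));

        // As windows-1255 it is a single byte, so one %XX escape.
        if (Charset.isSupported("windows-1255")) {
            System.out.println(encodeParam(alef, "windows-1255"));
        }

        // Plain ASCII letters pass through; a space becomes '+'.
        System.out.println(encodeParam("a b", "UTF-8"));
    }
}
```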

> 1.2 C2 pipe model cannot use encoding other than UTF8 ?

I think that it's possible, because both Xerces and Xalan are able to
process documents in different encodings. But I've never tried it, so I
can't help you on this point.

> 1.3 If so, how can I handle data that came FROM the database, and put it
> into the session argument, the html and the log file??

As you said above, you don't have problems with it now. Or did I get you wrong?

> 2. How did you solve your problem with Oracle? How did you insert proper
> data into the database? (actually, the problem is not with Oracle or the
> driver but in C2 - I guess that this should be posted as a bug, no?)

The problem was with C1 (not C2), and after we changed the Oracle DB
encoding to UTF-8 everything worked as expected. But that was about a
year ago...

> 3. Is there a way I can bypass the translation and give it directly to the
> DB action? This way, I won't need to make major changes in C2, and
> everybody will be happy... :-)

I think that you should try to find out where the data is changed. Anyway,
try to decode the parameter.

I don't think that I can provide much help, because I have been away from C2
for a long time and it seems that I have forgotten many details. I'll try to
provide an i18n sample with form data input and simple processing, and then
I'll be able to give you more definite answers.

Konstantin

>
> Thanks,
> Udi.
>
>
> On Thu, 27 Sep 2001, Piroumian, Konstantin wrote:
>
> > Hi!
> > See below...
> >
> > >
> > > C2, WNT (hebrew enabled), Tomcat3.2.3, MySQL, IE5.5
> > >
> > > Hey!
> > >
> > > I'm trying to write an application that uses Hebrew in forms (meaning that
> > > the user can type Hebrew chars into form elements, mainly input boxes).
> > > I guess that the problem is the same for any language whose chars get
> > > URL-encoded (%XX) when a form is submitted.
> > >
> > > I ran a simple application in Tomcat (as a plain servlet) and in Cocoon 2.
> > > It simply takes the data you entered in an input box (in Hebrew), places
> > > it into a request parameter and then displays the request parameter on a
> > > different page.
> > >
> > > In the POST message, I saw that Explorer is encoding my chars correctly:
> > > POST .....
> > > host: ...
> > >
> > > UserName: %E0%D9%E3....
> > >
> > > When I ran the application as a Tomcat servlet, the results were good: I
> > > saw the exact (Hebrew) chars that I had typed.
> > >
> > > In C2, however, the parameter did not show up correctly, and was encoded
> > > differently.
> >
> > Maybe you should try to configure your serializer to use the correct
> > encoding?
> > <map:serialize>
> >   <encoding>[HEBREW_ENCODING_NAME]</encoding>
> > </map:serialize>
> >
> > >
> > > The problem gets worse when I try to insert the data into a MySQL db (which
> > > expects the normal %XX encoding): I get garbage there as well.
> >
> > Is it the same garbage that you see on the screen? We had similar problems
> > with JDBC drivers and Oracle.
> >
> > >
> > >
> > > Has anyone used C2 with an html-encoded language? Can you tell me what I need
> > > to do to make it work?
> > > Why does it work in Tomcat and not in C2? Where is the translation being
> > > performed?
> >
> > I think that this happens because C2 uses Unicode (UTF-8) encoding for all
> > internal transformations, while Tomcat operates on bytes and does not perform
> > the extra encodings needed in C2.
> >
> > >
> > > Thanks,
> > > Udi.


Re: Language Support in Request Parameters

Posted by Udi Weinsberg <ud...@tochna.technion.ac.il>.
Hi!

You said you had the same problem in Oracle. I'm using MySQL with the MM
JDBC driver, which expects the normal chars encoding (%xx). Even if I use
the serializer encoding (which I am about to try now), it will not do the
trick, it will only try to overcome an inherent problem (if it will work
at all):

The serializer is the output's last pipe's stage, and it seems that the
problem is somewhere in the input pipe (meaning from the user to the
server and within the server). I can even see that the Log file has
incorrect data.

An interesting point is that when I retrieve data from the database into a
session argument (using the DBAuthAction), the data is inserted correctly
- meaning that I see it proper in the Log file and in the resulting page
(both in plain hebrew). As I looked into the code I saw that the Action
simply queries the db, and puts the result in the session param (am I
right, or is there any encoding modification here?). Thus, the results are
good, and I see my text as intended.

However, if the data is provided from the user, then the data gets
corrupted (reencoded?? where in c2?) - It is the same corrupted data for
the database, the resulting html and the log file. So, I guess that the
problem is in the translation of the data. Since Tomcat does not translate
the data, then (as you said) the problem is with the C2 translation.

1. Where does this translation take place? (which file in the source)
1.1 Why do we have this translation?
1.2 Can the C2 pipeline model not use an encoding other than UTF-8?
1.3 If so, how can I handle data that came FROM the database, and put it
into the session argument, the HTML and the log file?
2. How did you solve your problem with Oracle? How did you insert proper
data into the database? (Actually, the problem is not with Oracle or the
driver but in C2. I guess that this should be posted as a bug, no?)
3. Is there a way I can bypass the translation and pass the data directly
to the DB action? This way, I won't need to make major changes in C2, and
everybody will be happy... :-)

Thanks,
Udi.


On Thu, 27 Sep 2001, Piroumian, Konstantin wrote:

> Hi!
> See below...
>
> >
> > C2, WNT (hebrew enabled), Tomcat3.2.3, MySQL, IE5.5
> >
> > Hey!
> >
> > I'm trying to write an application that uses hebrew in forms (meaning that
> > the user can insert hebrew chars into form elements, mainly input boxes).
> > I guess that the problem is the same in any language which is encoded into
> > special html chars.
> >
> > I ran a simple application in tomcat (as a simple servlet) and in cocoon2,
> > which simply takes the data you entered in an input box (in hebrew), place
> > it into a request parameter and then displays the request parameter from a
> > different page.
> >
> > In the post message, I saw that explorer is coding my chars correct:
> > POST .....
> > host: ...
> >
> > UserName: %E0%D9%E3....
> >
> > When I ran the application on a Tomcat servlet - the results were good. I
> > saw the exact (hebrew) chars that I've written before.
> >
> > On the C2, however, the parameter did not show up correctly, and was coded
> > differently.
>
> Maybe you should try to configure your serializer to use the correct
> encoding?
> <map:serialize>
>   <encoding>[HEBREW_ENCODING_NAME]</encoding>
> </map:serialize>
>
> >
> > The problem is greater when I try to insert data into MySQL db (which
> > expects the normal %XX encoding) and get garbage there as well.
>
> Is it the same garbage that you see on the screen? We had similar problems
> with JDBC drivers and Oracle.
>
> >
> >
> > Did anyone use C2 in an html-encoded language? Can you tell me what I need
> > to do to make it work?
> > Why is Tomcat working and C2 not? Where is the translation being
> > preformed?
>
> I think this happens because C2 uses Unicode (UTF-8) encoding for all
> internal transformations, whereas Tomcat operates on bytes and does not
> perform the extra encoding steps needed in C2.
>
> >
> > Thanks,
> > Udi.
> >
> >
> >




Re: Language Support in Request Parameters

Posted by "Piroumian, Konstantin" <KP...@flagship.ru>.
Hi!
See below...

>
> C2, WNT (hebrew enabled), Tomcat3.2.3, MySQL, IE5.5
>
> Hey!
>
> I'm trying to write an application that uses hebrew in forms (meaning that
> the user can insert hebrew chars into form elements, mainly input boxes).
> I guess that the problem is the same in any language which is encoded into
> special html chars.
>
> I ran a simple application in tomcat (as a simple servlet) and in cocoon2,
> which simply takes the data you entered in an input box (in hebrew), place
> it into a request parameter and then displays the request parameter from a
> different page.
>
> In the post message, I saw that explorer is coding my chars correct:
> POST .....
> host: ...
>
> UserName: %E0%D9%E3....
>
> When I ran the application on a Tomcat servlet - the results were good. I
> saw the exact (hebrew) chars that I've written before.
>
> On the C2, however, the parameter did not show up correctly, and was coded
> differently.

Maybe you should try to configure your serializer to use the correct
encoding?
<map:serialize>
    <encoding>[HEBREW_ENCODING_NAME]</encoding>
</map:serialize>

>
> The problem is greater when I try to insert data into MySQL db (which
> expects the normal %XX encoding) and get garbage there as well.

Is it the same garbage that you see on the screen? We had similar problems
with JDBC drivers and Oracle.

>
>
> Did anyone use C2 in an html-encoded language? Can you tell me what I need
> to do to make it work?
> Why is Tomcat working and C2 not? Where is the translation being
> preformed?

I think this happens because C2 uses Unicode (UTF-8) encoding for all
internal transformations, whereas Tomcat operates on bytes and does not
perform the extra encoding steps needed in C2.
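
As a small illustration of why bytes and Unicode text are not
interchangeable, the Java sketch below encodes one Hebrew letter in two
charsets. It says nothing about C2 internals; the class name BytesDemo and
the use of windows-1255 as the single-byte Hebrew charset are assumptions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BytesDemo {
    // Render a byte array as upper-case hex, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encode text in the named charset and return the bytes as hex.
    static String encodedHex(String text, String charsetName) {
        return hex(text.getBytes(Charset.forName(charsetName)));
    }

    public static void main(String[] args) {
        String alef = "\u05D0"; // Hebrew letter alef

        // One byte in the single-byte Hebrew charset...
        System.out.println(encodedHex(alef, "windows-1255")); // prints E0
        // ...but two bytes in UTF-8.
        System.out.println(hex(alef.getBytes(StandardCharsets.UTF_8))); // prints D790
    }
}
```

A component that passes the raw byte E0 through untouched and one that
converts it to Unicode and back will only agree if both use the same
charset for the conversion.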

>
> Thanks,
> Udi.
>
>
>