Posted to dev@cocoon.apache.org by Pier Fumagalli <pi...@betaversion.org> on 2005/09/05 01:08:06 UTC

Extremely odd source in XSPs (breaks build)

/Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:214: unclosed character literal
                 case '�':
                      ^
/Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:214: unclosed character literal
                 case '�':
                          ^
/Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:215: : expected
                     parser.append(ch);
                                      ^
/Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:241: unclosed character literal
     protected static final State EXPRESSION_SHELL_STATE = new QuotedState('�');
                                                                           ^
/Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:241: unclosed character literal
     protected static final State EXPRESSION_SHELL_STATE = new QuotedState('�');
                                                                               ^
/Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:241: ')' expected
     protected static final State EXPRESSION_SHELL_STATE = new QuotedState('�');

It's probably because I use a mac... but that character to me looks  
like a diamond with a question-mark in the middle. I tend to use  
UTF-8 everywhere, can whoever committed that code, change the  
character to a valid \u???? character? I personally can't do a full  
build of Cocoon 2.1.x right now unless I exclude XSP (and even when  
doing so, JAVADOC breaks).

Thanks!

     Pier

Re: Extremely odd source in XSPs (breaks build)

Posted by Sylvain Wallez <sy...@apache.org>.

Pier Fumagalli wrote:

> FLOG ME!!!!!!!!!!!!!!!! I AM A FUCKING MORON!


ROTFL :-)

Sylvain, who loves Pier's language.

-- 
Sylvain Wallez                        Anyware Technologies
http://people.apache.org/~sylvain     http://www.anyware-tech.com
Apache Software Foundation Member     Research & Technology Director

Re: Extremely odd source in XSPs (breaks build)

Posted by Pier Fumagalli <pi...@betaversion.org>.

On 5 Sep 2005, at 00:08, Pier Fumagalli wrote:

> /Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:214: unclosed character literal
>                 case '�':
>
> It's probably because I use a mac... but that character to me looks  
> like a diamond with a question-mark in the middle. I tend to use  
> UTF-8 everywhere, can whoever committed that code, change the  
> character to a valid \u???? character? I personally can't do a full  
> build of Cocoon 2.1.x right now unless I exclude XSP (and even when  
> doing so, JAVADOC breaks).
>
> Thanks!

FLOG ME!!!!!!!!!!!!!!!! I AM A FUCKING MORON!

Something went wild in my Eclipse... That character should be '´'...  
My bad at revision 278591.
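
For illustration only, a minimal sketch of the ASCII-safe form, assuming the intended character is U+00B4 (ACUTE ACCENT). The class and constant names are made up and are not taken from XSPExpressionParser:

```java
// Hypothetical illustration only; not the actual Cocoon source.
class AcuteEscapeSketch {
    // U+00B4 ACUTE ACCENT written as a Unicode escape: the source file
    // stays pure ASCII, so it compiles the same under any file encoding.
    static final char SHELL_QUOTE = '\u00B4';

    public static void main(String[] args) {
        // Print the code point in hex to confirm which character this is.
        System.out.println(Integer.toHexString(SHELL_QUOTE));
    }
}
```

Written this way, the literal no longer depends on the encoding the editor or the compiler assumes for the file.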

     Pier

Re: Extremely odd source in XSPs (breaks build)

Posted by Bertrand Delacretaz <bd...@apache.org>.

Le 5 sept. 05, à 01:34, Pier Fumagalli a écrit :
> ...We should make sure that whenever code is written, any character 
> outside the "Basic-Latin" Unicode spec 
> <http://www.unicode.org/charts/PDF/U0000.pdf> is correctly encoded as 
> \uXXXX, otherwise things are going to start breaking... :-D

+1, but I don't know an easy way to enforce this. If someone has a 
lightweight tool or compiler switch to check that, it would be good.

-Bertrand

Re: Extremely odd source in XSPs (breaks build)

Posted by Pier Fumagalli <pi...@betaversion.org>.

On 5 Sep 2005, at 02:13, Antonio Gallardo wrote:

>> Seriously talking, it depends on a number of things, not only
>> your platform encoding, but also on your editor's, your compiler
>> and so on... The good thing about XML is the <?xml version="1.0"
>> encoding="xxxx"?> sorting out 99.9% of the problems, but that
>> said, the same problem can pop up here and there...
>
> Yep. I think the migration to UNICODE is almost done in the
> industry. Actually, we observe a lot fewer cases of encoding
> problems than a few years ago. I think it is proof that the
> industry now knows better than ever how to deal with UNICODE.

To tell you the truth, ancient Greeks wrote in UNICODE:

http://www.unicode.org/charts/PDF/U0370.pdf
http://www.unicode.org/charts/PDF/U1F00.pdf

but even better:

http://www.unicode.org/charts/PDF/U10140.pdf
http://www.unicode.org/charts/PDF/U1D200.pdf

For what it's worth, Klingons, through a sub-standard called
"ConScript", write UNICODE:

http://www.evertype.com/standards/csur/klingon.html

People, let's _NOT_ confuse UNICODE, the reference table of all
characters humanly comprehensible, with the ENCODINGS, their
representation as a sequence of bytes.
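
To make the distinction concrete, here is a small sketch (the class and method names are made up for illustration). It takes one Unicode character, U+00B4, and shows that ISO-8859-1 and UTF-8 represent it with different bytes. It also shows what happens when the ISO-8859-1 byte is read back as UTF-8, which is presumably how the diamond with a question mark (U+FFFD) appeared in the build output:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative.
class EncodingSketch {
    static byte[] toLatin1(String s) { return s.getBytes(StandardCharsets.ISO_8859_1); }
    static byte[] toUtf8(String s) { return s.getBytes(StandardCharsets.UTF_8); }
    static String fromUtf8(byte[] bytes) { return new String(bytes, StandardCharsets.UTF_8); }

    public static void main(String[] args) {
        String acute = "\u00B4"; // one Unicode character, ACUTE ACCENT

        System.out.println(toLatin1(acute).length); // number of bytes in ISO-8859-1
        System.out.println(toUtf8(acute).length);   // number of bytes in UTF-8

        // Read the ISO-8859-1 bytes as if they were UTF-8: the lone 0xB4
        // byte is malformed UTF-8, so the decoder substitutes U+FFFD.
        String misread = fromUtf8(toLatin1(acute));
        System.out.println(Integer.toHexString(misread.charAt(0)));
    }
}
```

The character is the same in both cases; only the byte sequence changes, and a reader that assumes the wrong encoding gets the replacement character instead.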

     Pier

Re: Extremely odd source in XSPs (breaks build)

Posted by Antonio Gallardo <ag...@agssa.net>.

Pier Fumagalli wrote:

> On 5 Sep 2005, at 01:23, Antonio Gallardo wrote:
>
>>>
>>> I'm seeing LOTSA characters from Latin-1 in the source code
>>> <http://www.unicode.org/charts/PDF/U0080.pdf> (especially in
>>> people's names!)
>>
>>
>> You wanna see name? Then use "svn log [src-path]" or "svn blame  
>> [src-path]" ;-)
>
>
> Not interested in who to "blame" :-) The "name" reference was more
> about things like names of people with German umlauts :-) :-)
>
>> For sure, it is not me, I use UTF-8:
>>
>> $locale
>> LANG=es_NI.UTF-8
>
>
> Then you've never used the SHELL_ESCAPE in XSPs for using Python  
> bindings (or whatever that is) :-)

Never! IMO, this can be a very dangerous programming technique! [0] :-D

>
> Seriously talking, it depends on a number of things, not only your  
> platform encoding, but also on your editor's, your compiler and so  
> on... The good thing about XML is the <?xml version="1.0"  
> encoding="xxxx"?> sorting out 99.9% of the problems, but that said,  
> the same problem can pop up here and there...

Yep. I think the migration to UNICODE is almost done in the industry. 
Actually, we observe a lot fewer cases of encoding problems than a few 
years ago. I think it is proof that the industry now knows better than 
ever how to deal with UNICODE.

>
>>> We should make sure that whenever code is written, any character
>>> outside the "Basic-Latin" Unicode spec
>>> <http://www.unicode.org/charts/PDF/U0000.pdf> is correctly encoded
>>> as \uXXXX, otherwise things are going to start breaking... :-D
>>
>>
>> +1 and this is why I sent my first mail related to this issue:
>>
>> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=112586325420497
>
>
> Any volunteers to write a little script looking for BYTEs (not
> characters) in our source code that are outside of the 00-7F (0-127)
> range?

Does one case per year justify this? ;-)

Best Regards,

Antonio Gallardo

[0] - http://www.manbir-online.com/snakes/python.htm

Re: Extremely odd source in XSPs (breaks build)

Posted by Pier Fumagalli <pi...@betaversion.org>.

On 5 Sep 2005, at 01:23, Antonio Gallardo wrote:
>>
>> I'm seeing LOTSA characters from Latin-1 in the source code
>> <http://www.unicode.org/charts/PDF/U0080.pdf> (especially in
>> people's names!)
>
> You wanna see name? Then use "svn log [src-path]" or "svn blame  
> [src-path]" ;-)

Not interested in who to "blame" :-) The "name" reference was more
about things like names of people with German umlauts :-) :-)

> For sure, it is not me, I use UTF-8:
>
> $locale
> LANG=es_NI.UTF-8

Then you've never used the SHELL_ESCAPE in XSPs for using Python  
bindings (or whatever that is) :-)

Seriously talking, it depends on a number of things, not only your  
platform encoding, but also on your editor's, your compiler and so  
on... The good thing about XML is the <?xml version="1.0"  
encoding="xxxx"?> sorting out 99.9% of the problems, but that said,  
the same problem can pop up here and there...

>> We should make sure that whenever code is written, any character
>> outside the "Basic-Latin" Unicode spec
>> <http://www.unicode.org/charts/PDF/U0000.pdf> is correctly encoded
>> as \uXXXX, otherwise things are going to start breaking... :-D
>
> +1 and this is why I sent my first mail related to this issue:
>
> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=112586325420497

Any volunteers to write a little script looking for BYTEs (not
characters) in our source code that are outside of the 00-7F (0-127)
range?
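
A rough sketch of such a check, written in Java to match the code base. The class name, the `.java` file filter and the output format are all assumptions for illustration; this is not an existing Cocoon tool:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; reports raw bytes outside 00-7F, not characters.
class NonAsciiScanner {

    /** Returns one "line:column 0xNN" entry (1-based, byte columns) per byte above 0x7F. */
    static List<String> findNonAscii(byte[] data) {
        List<String> hits = new ArrayList<>();
        int line = 1;
        int col = 0;
        for (byte b : data) {
            if (b == '\n') {          // new line: reset the byte column
                line++;
                col = 0;
                continue;
            }
            col++;
            int value = b & 0xFF;     // bytes are signed in Java; mask to 0-255
            if (value > 0x7F) {
                hits.add(line + ":" + col + " 0x" + Integer.toHexString(value).toUpperCase());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> sources;
        try (Stream<Path> paths = Files.walk(root)) {
            sources = paths.filter(Files::isRegularFile)
                           .filter(p -> p.toString().endsWith(".java"))
                           .collect(Collectors.toList());
        }
        for (Path source : sources) {
            for (String hit : findNonAscii(Files.readAllBytes(source))) {
                System.out.println(source + ":" + hit);
            }
        }
    }
}
```

Because it works on bytes, it does not need to know which encoding a file was saved in; run against a source tree, it prints one line per offending byte.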

     Pier

Re: Extremely odd source in XSPs (breaks build)

Posted by Antonio Gallardo <ag...@agssa.net>.

Pier Fumagalli wrote:

> On 5 Sep 2005, at 00:15, Antonio Gallardo wrote:
>
>> Hi Pier,
>>
>> Please read:
>> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=112586325420497 ;-)
>
>
> Yep, that brings up a point... Who in the world put that
> character there in the very first place? I mean, not everyone uses
> ISO8859-1 as their encoding, more and more people are moving to UTF-8
> nowadays (read -me- :-) :-)
>
> I'm seeing LOTSA characters from Latin-1 in the source code
> <http://www.unicode.org/charts/PDF/U0080.pdf> (especially in
> people's names!)

You wanna see name? Then use "svn log [src-path]" or "svn blame 
[src-path]" ;-)

For sure, it is not me, I use UTF-8:

$locale
LANG=es_NI.UTF-8
LC_CTYPE="es_NI.UTF-8"
LC_NUMERIC="es_NI.UTF-8"
LC_TIME="es_NI.UTF-8"
LC_COLLATE="es_NI.UTF-8"
LC_MONETARY="es_NI.UTF-8"
LC_MESSAGES="es_NI.UTF-8"
LC_PAPER="es_NI.UTF-8"
LC_NAME="es_NI.UTF-8"
LC_ADDRESS="es_NI.UTF-8"
LC_TELEPHONE="es_NI.UTF-8"
LC_MEASUREMENT="es_NI.UTF-8"
LC_IDENTIFICATION="es_NI.UTF-8"
LC_ALL=

>
> We should make sure that whenever code is written, any character
> outside the "Basic-Latin" Unicode spec
> <http://www.unicode.org/charts/PDF/U0000.pdf> is correctly encoded as
> \uXXXX, otherwise things are going to start breaking... :-D


+1 and this is why I sent my first mail related to this issue:

http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=112586325420497

Best Regards,

Antonio Gallardo

Re: Extremely odd source in XSPs (breaks build)

Posted by Pier Fumagalli <pi...@betaversion.org>.

On 5 Sep 2005, at 00:15, Antonio Gallardo wrote:

> Hi Pier,
>
> Please read:
> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=112586325420497 ;-)

Yep, that brings up a point... Who in the world put that
character there in the very first place? I mean, not everyone uses
ISO8859-1 as their encoding, more and more people are moving to UTF-8
nowadays (read -me- :-) :-)

I'm seeing LOTSA characters from Latin-1 in the source code
<http://www.unicode.org/charts/PDF/U0080.pdf> (especially in
people's names!)

We should make sure that whenever code is written, any character
outside the "Basic-Latin" Unicode spec
<http://www.unicode.org/charts/PDF/U0000.pdf> is correctly encoded as
\uXXXX, otherwise things are going to start breaking... :-D

     Pier

Re: Extremely odd source in XSPs (breaks build)

Posted by Antonio Gallardo <ag...@agssa.net>.

Hi Pier,

Please read: 
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=112586325420497 ;-)

Best Regards,

Antonio Gallardo.

Pier Fumagalli wrote:

> /Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:214: unclosed character literal
>                 case '�':
>                      ^
> /Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:214: unclosed character literal
>                 case '�':
>                          ^
> /Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:215: : expected
>                     parser.append(ch);
>                                      ^
> /Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:241: unclosed character literal
>     protected static final State EXPRESSION_SHELL_STATE = new QuotedState('�');
>                                                                           ^
> /Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:241: unclosed character literal
>     protected static final State EXPRESSION_SHELL_STATE = new QuotedState('�');
>                                                                               ^
> /Users/pier/Workspace/cocoon/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java:241: ')' expected
>     protected static final State EXPRESSION_SHELL_STATE = new QuotedState('�');
>
> It's probably because I use a mac... but that character to me looks  
> like a diamond with a question-mark in the middle. I tend to use  
> UTF-8 everywhere, can whoever committed that code, change the  
> character to a valid \u???? character? I personally can't do a full  
> build of Cocoon 2.1.x right now unless I exclude XSP (and even when  
> doing so, JAVADOC breaks).
>
> Thanks!


>     Pier
>