Posted to dev@flex.apache.org by Cyrill Zadra <cy...@gmail.com> on 2012/11/16 05:22:47 UTC

[FALCON] Problem with \uFEFF characters in mustella files

Hi

In the mustella tests there are files that contain \uFEFF (just
before the package declaration), and in some cases Falcon has a problem
with this character and ends with the error:

[java] Error: Unexpected character. '?' is not allowed here
[java] ?package assets.styleTest

The following command searches for lines that start with a non-ASCII
character; executed in the mustella folder, it returns quite a lot of
matches.

grep --color='auto' -I -R -P -n "^[\x80-\xFF]" *

Are those characters really supposed to be there?
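
A narrower check than the grep above, as a minimal sketch: it assumes the files are UTF-8 encoded and reports only the byte sequence EF BB BF (U+FEFF in UTF-8) when it appears anywhere other than offset 0. The function names and the default extensions (.as, .mxml) are hypothetical choices for illustration, not something taken from the mustella tree.

```python
import os

UTF8_BOM = b"\xef\xbb\xbf"  # U+FEFF encoded as UTF-8


def misplaced_bom_offsets(data):
    """Return the byte offsets of every UTF-8 BOM that is not at offset 0."""
    offsets = []
    pos = data.find(UTF8_BOM, 1)  # start at 1 so a leading BOM is ignored
    while pos != -1:
        offsets.append(pos)
        pos = data.find(UTF8_BOM, pos + len(UTF8_BOM))
    return offsets


def scan_tree(root, extensions=(".as", ".mxml")):
    """Yield (path, offsets) for files under root that contain a misplaced BOM."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, "rb") as handle:
                    offsets = misplaced_bom_offsets(handle.read())
                if offsets:
                    yield path, offsets
```

Calling list(scan_tree("mustella")) from the checkout root would give the list of affected files together with the offsets at which the stray BOMs sit.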

cyrill

Re: [FALCON] Problem with \uFEFF characters in mustella files

Posted by Cyrill Zadra <cy...@gmail.com>.
Hi Gordon

I think so too. I'll try to create a list of all the affected files,
and if there aren't too many I'll fix them manually; otherwise
there may be a way to do it with a script.
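
One way such a script could look, as a rough sketch only: it assumes the files are valid UTF-8, treats every EF BB BF sequence after offset 0 as a stray BOM, and rewrites the file in place. The names strip_misplaced_boms and fix_file and the keep_leading switch are hypothetical; running it on a copy or a clean working tree first would be prudent.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # U+FEFF encoded as UTF-8


def strip_misplaced_boms(data, keep_leading=True):
    """Remove UTF-8 BOM sequences, optionally keeping one at offset 0."""
    if keep_leading and data.startswith(UTF8_BOM):
        # Preserve the genuine leading BOM, drop any later ones.
        return UTF8_BOM + data[len(UTF8_BOM):].replace(UTF8_BOM, b"")
    return data.replace(UTF8_BOM, b"")


def fix_file(path, keep_leading=True):
    """Rewrite path in place if a BOM was removed; return True if it changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    cleaned = strip_misplaced_boms(original, keep_leading)
    if cleaned == original:
        return False
    with open(path, "wb") as handle:
        handle.write(cleaned)
    return True
```

Working on raw bytes rather than decoded text keeps line endings and everything else in the file untouched, so the resulting diff should show only the removed BOM.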

cyrill





On 16.11.2012 at 11:15, Gordon Smith <go...@adobe.com> wrote:

> I believe a BOM is only a BOM if it's at the beginning of the file. So I think Falcon is correct to complain and the files should be fixed.
>
> - Gordon

RE: [FALCON] Problem with \uFEFF characters in mustella files

Posted by Gordon Smith <go...@adobe.com>.
I believe a BOM is only a BOM if it's at the beginning of the file. So I think Falcon is correct to complain and the files should be fixed.

- Gordon


Re: [FALCON] Problem with \uFEFF characters in mustella files

Posted by Paul Hastings <pa...@gmail.com>.
On 11/16/2012 4:58 PM, Cyrill Zadra wrote:
> I may have found the problem. It looks like there are files where the
> BOM isn't at the beginning of the file, and that's the case where Falcon
> can run into problems.

if the files are UTF-8 encoded, the BOM is optional & really has no meaning 
(there's only the one byte order direction in UTF-8). if a BOM is present it 
*has* to be at the start of the text stream. if the files are intended as ASCII 
then the BOM shouldn't be in there at all (ASCII chars are represented as 
themselves in UTF-8 anyway).
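
The points above can be checked with a few lines of Python. This is only an illustration using Python's standard utf-8-sig codec, which drops a BOM only when it is the first thing in the stream; it says nothing about how Falcon itself reads files, and decode_source is a hypothetical helper name.

```python
import codecs

# The UTF-8 "BOM" is simply U+FEFF encoded as three bytes.
assert "\ufeff".encode("utf-8") == codecs.BOM_UTF8 == b"\xef\xbb\xbf"


def decode_source(data):
    """Decode UTF-8 bytes, dropping a BOM only if it starts the stream."""
    return data.decode("utf-8-sig")


# BOM at the start of the stream: removed by the codec.
leading = decode_source(codecs.BOM_UTF8 + b"package a {}")

# BOM after a header comment: kept in the text as the character U+FEFF.
misplaced = decode_source(b"// header\n" + codecs.BOM_UTF8 + b"package a {}")

# ASCII text has the same bytes in ASCII and in UTF-8.
same_bytes = "package".encode("ascii") == "package".encode("utf-8")
```

Here leading contains no U+FEFF, while misplaced still carries it right in front of "package", which would be consistent with a tokenizer reporting an unexpected character at that position.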

just saying...

Re: [FALCON] Problem with \uFEFF characters in mustella files

Posted by Alex Harui <ah...@adobe.com>.


On 11/16/12 1:58 AM, "Cyrill Zadra" <cy...@gmail.com> wrote:

> I may have found the problem. It looks like there are files where the
> BOM isn't at the beginning of the file, and that's the case where Falcon
> can run into problems.

Hah!  The magic of Perl scripts that replace headers.  OK, we will need to
clean that up somehow.  Any volunteers?

-- 
Alex Harui
Flex SDK Team
Adobe Systems, Inc.
http://blogs.adobe.com/aharui


Re: [FALCON] Problem with \uFEFF characters in mustella files

Posted by Cyrill Zadra <cy...@gmail.com>.
I may have found the problem. It looks like there are files where the
BOM isn't at the beginning of the file, and that's the case where Falcon
can run into problems.

ActionScript Example

////////////////////////////////////////////////////////////////////////////////
//
//  Apache License Header
//
////////////////////////////////////////////////////////////////////////////////
BOMpackage ...
{

}

cyrill


Re: [FALCON] Problem with \uFEFF characters in mustella files

Posted by Cyrill Zadra <cy...@gmail.com>.
Thanks Alex. I just looked into the Falcon code and found a few places
where it already ignores the BOM. So it must be a rare exception. I'll take
a look into that and try to find out in which case it fails.

cyrill


Re: [FALCON] Problem with \uFEFF characters in mustella files

Posted by Alex Harui <ah...@adobe.com>.


On 11/15/12 8:22 PM, "Cyrill Zadra" <cy...@gmail.com> wrote:

> Are those characters really supposed to be there?

See [1].  Falcon will have to learn to ignore them.

[1] http://www.rgagnon.com/javadetails/java-handle-utf8-file-with-bom.html

-- 
Alex Harui
Flex SDK Team
Adobe Systems, Inc.
http://blogs.adobe.com/aharui