Posted to user@ant.apache.org by Peter Desjardins <pe...@gmail.com> on 2011/03/11 19:26:28 UTC

Regular expression to match non-breaking spaces

Hi. I'm trying to replace every non-breaking space character in a file
with a normal space character. The source file is encoded using UTF-8.
I can't get this to work using the replaceregexp task. Here's the
syntax I'm using:

  <replaceregexp file="myfile.xml" match="&#xa0;" replace=" "
flags="g" byline="true" encoding="utf-8" />

I've tried other ways to match non-breaking spaces (&#160;, \0240,
U+00A0) but have had no success yet. I didn't alter the regular
expression libraries in my Ant installation. Is that what I need to
do? Or is there a string I can put into the match attribute that will
match UTF-8 non-breaking spaces?

Thanks for your help.

Peter Desjardins

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@ant.apache.org
For additional commands, e-mail: user-help@ant.apache.org


Re: Regular expression to match non-breaking spaces

Posted by wolfgang haefelinger <wh...@gmail.com>.
Peter,

>>   <replaceregexp file="myfile.xml" match="&#xa0;" replace=" "
>> flags="g" byline="true" encoding="utf-8" />

I assume that Ant's XML parser will resolve your character reference
before the regular expression engine ever sees it. In effect you are
doing nothing more than

<replaceregexp file="myfile.xml" match=" " replace=" " flags="g"
byline="true" encoding="utf-8" />

Have you tried something like

<replaceregexp file="myfile.xml" match=" (&amp;)#xa0;" replace=" "
flags="g" byline="true" encoding="utf-8" />

instead?
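
To make the distinction concrete, here is a minimal Java sketch of the
two cases. It assumes Ant hands the match string to java.util.regex
(which I believe is the default when no other regular expression
library is configured). The class and method names are hypothetical
and exist only for illustration. The first pattern targets the actual
U+00A0 character; the second targets the literal six-character text
"&#xa0;" that a file may contain if the reference was never expanded.

```java
import java.util.regex.Pattern;

public class NbspDemo {
    // Matches the real non-breaking space character (U+00A0). The regex
    // escape keeps the invisible character out of the source text.
    static final Pattern NBSP_CHAR = Pattern.compile("\\u00A0");

    // Matches the unexpanded character reference as plain text.
    static final Pattern NBSP_REFERENCE =
            Pattern.compile("&#xa0;", Pattern.CASE_INSENSITIVE);

    // Replaces every real non-breaking space with a normal space.
    static String replaceNbspChar(String line) {
        return NBSP_CHAR.matcher(line).replaceAll(" ");
    }

    // Replaces every literal "&#xa0;" with a normal space.
    static String replaceNbspReference(String line) {
        return NBSP_REFERENCE.matcher(line).replaceAll(" ");
    }

    public static void main(String[] args) {
        String withChar = "a\u00A0b";       // contains the character itself
        String withReference = "a&#xa0;b";  // contains the reference as text

        // Each pattern only affects the form it was written for.
        System.out.println(replaceNbspChar(withChar).equals("a b"));
        System.out.println(replaceNbspChar(withReference).equals(withReference));
        System.out.println(replaceNbspReference(withReference).equals("a b"));
        System.out.println(replaceNbspReference(withChar).equals(withChar));
    }
}
```

If the assumption about the engine holds, a match attribute such as
match="\u00A0" should reach it unchanged, because the XML parser does
not interpret backslashes. I have not verified this against a
particular Ant version.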

// Wolfgang

On Sat, Mar 12, 2011 at 12:48 AM, Brian Agnew <br...@oopsconsultancy.com> wrote:
> You might want to try XMLTask, which reads XML natively and handles the
> specified character encodings. It'll do regexp replacements.
>
> http://www.oopsconsultancy.com/software/xmltask

-- 
Wolfgang Häfelinger
häfelinger IT - Applied Software Architecture
http://www.haefelinger.it
+31 648 27 61 59



Re: Regular expression to match non-breaking spaces

Posted by Brian Agnew <br...@oopsconsultancy.com>.
You might want to try XMLTask, which reads XML natively and handles the 
specified character encodings. It'll do regexp replacements.

http://www.oopsconsultancy.com/software/xmltask

-- 
Brian Agnew                  http://www.oopsconsultancy.com
OOPS Consultancy Ltd
Tel: +44 (0)7720 397526
Fax: +44 (0)20 8682 0012

