Posted to user@ant.apache.org by Jerry Chimey <je...@yahoo.com> on 2008/11/16 23:48:12 UTC

ANT replaceregexp problem

Hi,
   I am seeing a weird problem when using replaceregexp in ANT.
Non-English characters in the file are changed even though they are not matched by the regular expression.

The following is my original input:
-------------<replace.xml before> ------
 <?xml version="1.0" encoding="UTF-8"?>
                 <web-app>  
                    <url>file://localhost/$server_root$/deployed/archive/wcm.contentviewer.1001/ilwwcm-localrendering-portlet.war</url>
                    <title>â€Ø¨Ø¯ÙˆÙ† شكل عامâ€</title>
                 </web-app>
------------------------------------------------

   My original intention is ONLY to update the content of the <url> tag. The following is the code:
    <target name ="testRegex">
        <replaceregexp file="replace.xml"
                         match="file://localhost/\$server_root\$/deployed/archive/[a-zA-Z.0-9]+/"
                         replace="file://localhost/$server_root$/installableApps/"
                         byline="yes"/>
    
    </target>

After I executed this target, I got the following result:
-----------------<replace.xml after>------------
                <?xml version="1.0" encoding="UTF-8"?>
                 <web-app>  
                    <url>file://localhost/$server_root$/installableApps/ilwwcm-localrendering-portlet.war</url>
                    <title>�بدون شكل عام�</title>
                 </web-app>
----------------------------------------------

The url was replaced correctly; however, some characters in the title tag were replaced with '?'.
I used ANT 1.7 and also tried jakarta-oro-2.0.8 to perform the regular expression replacement, but still got the same problem.

C:\>ant -Dant.regexp.regexpimpl=org.apache.tools.ant.util.regexp.JakartaOroRegexp -f test.xml testRegex
Buildfile: test.xml

testRegex:

BUILD SUCCESSFUL
Total time: 0 seconds

Has anyone seen this problem, and does anyone have an idea how to fix it?

Thanks

Jerry


      

Re: ANT replaceregexp problem

Posted by Jerry Chimey <je...@yahoo.com>.
Hi, Brian:
    You are right. After setting file.encoding, it is working for now. I agree with you that replaceregexp is not a good solution for XML replacement.
I am going to try XMLTask or XSLT for this type of work.

Thanks again.

Jerry
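
The XSLT route mentioned above could look roughly like the sketch below. It is not from the thread: the stylesheet name fix-url.xsl and the output file name in the usage example are made up for illustration, and it assumes the structure shown in the original post (a url element whose text starts directly with the deployed/archive prefix). XSLT 1.0 has no regular expressions, so instead of matching [a-zA-Z.0-9]+ the sketch simply drops the first path segment after the prefix, which is a slightly looser rule than the original regex.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- fix-url.xsl (hypothetical name): rewrites the <url> prefix, copies everything else -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Write the result as UTF-8 instead of relying on the platform default -->
  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Identity template: copy every node and attribute no other template handles -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Only the text of url elements that starts with the old prefix is rewritten -->
  <xsl:template match="url/text()[starts-with(., 'file://localhost/$server_root$/deployed/archive/')]">
    <!-- Everything after the old prefix: archive directory, slash, file name -->
    <xsl:variable name="rest"
                  select="substring-after(., 'file://localhost/$server_root$/deployed/archive/')"/>
    <xsl:text>file://localhost/$server_root$/installableApps/</xsl:text>
    <!-- Skip the archive directory segment and keep what follows the next slash -->
    <xsl:value-of select="substring-after($rest, '/')"/>
  </xsl:template>

</xsl:stylesheet>
```

From a build file it might be run with Ant's xslt task, writing to a separate (again hypothetical) output file:

```xml
<target name="testXslt">
    <!-- in/out/style are attributes of Ant's built-in xslt task -->
    <xslt in="replace.xml" out="replace-updated.xml" style="fix-url.xsl"/>
</target>
```

Because an XML parser reads the input, the encoding declared in the XML prolog is honoured, so the title text should pass through unchanged.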





________________________________
From: Brian Agnew <br...@oopsconsultancy.com>
To: Ant Users List <us...@ant.apache.org>
Sent: Monday, November 17, 2008 4:02:08 AM
Subject: Re: ANT replaceregexp problem

Your XML file specifies the char encoding being used (UTF-8), but I'm
guessing that the replaceregexp task won't understand this, since it's not
XML-aware. So your replacement string may be getting written out using
another encoding - most likely your default environment encoding.

Try setting -Dfile.encoding=utf8 (or it might be utf-8, or similar. You
get the idea).

Note that if you want to do XML replacement, and you need to maintain
encodings, then XMLTask may be a better solution.

Brian


-- 
Brian Agnew                  http://www.oopsconsultancy.com
OOPS Consultancy Ltd
Tel: +44 (0)7720 397526
Fax: +44 (0)20 8682 0012


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@ant.apache.org
For additional commands, e-mail: user-help@ant.apache.org


      

Re: ANT replaceregexp problem

Posted by Brian Agnew <br...@oopsconsultancy.com>.
Your XML file specifies the char encoding being used (UTF-8), but I'm
guessing that the replaceregexp task won't understand this, since it's not
XML-aware. So your replacement string may be getting written out using
another encoding - most likely your default environment encoding.

Try setting -Dfile.encoding=utf8 (or it might be utf-8, or similar. You
get the idea).

Note that if you want to do XML replacement, and you need to maintain
encodings, then XMLTask may be a better solution.

Brian
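
The same effect can probably be had without changing the JVM-wide default: the replaceregexp task documents an encoding attribute (listed as available since Ant 1.6), which sets the encoding used for the file being processed. A minimal sketch, reusing the target from the original post unchanged except for the added encoding attribute:

```xml
<target name="testRegex">
    <!-- encoding sets the charset used for replace.xml;
         when it is omitted, the JVM default encoding is used -->
    <replaceregexp file="replace.xml"
                   encoding="UTF-8"
                   match="file://localhost/\$server_root\$/deployed/archive/[a-zA-Z.0-9]+/"
                   replace="file://localhost/$server_root$/installableApps/"
                   byline="yes"/>
</target>
```

With the encoding stated on the task, the result should no longer depend on file.encoding or on the platform default, though the task still treats the file as plain text rather than as XML.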
