Posted to dev@jspwiki.apache.org by "Janne Jalkanen (JIRA)" <ji...@apache.org> on 2008/06/12 23:00:46 UTC

[jira] Created: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Replace ORO regexp library with Java 5 regexps
----------------------------------------------

                 Key: JSPWIKI-291
                 URL: https://issues.apache.org/jira/browse/JSPWIKI-291
             Project: JSPWiki
          Issue Type: Task
            Reporter: Janne Jalkanen
            Priority: Trivial


Now that Java has a good regexp library, it might be a good idea to get rid of the oro library, as that would reduce our dependencies on external libraries.  This should be a relatively easy task, except where Glob patterns or oro-specific regexps are used.

No particular release is targeted; whenever we have time.  This would be nice low-hanging fruit for someone to start contributing to JSPWiki with...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Janne Jalkanen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657604#action_12657604 ] 

Janne Jalkanen commented on JSPWIKI-291:
----------------------------------------

The reason why testCCLinkWithScandics() fails is that in Java, \p{Upper} matches only US-ASCII.  You need to match against the Unicode upper case class, which is \p{Lu}.

Check out the java.util.regex.Pattern javadoc, the java.lang.Character javadoc, and the Unicode categories at http://www.unicode.org/versions/Unicode4.0.0/ch04.pdf; those should make it all clear :-)
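
The difference between the two classes can be checked with a few lines of Java. This is only a minimal sketch: the class name UpperClassDemo and its helper methods are made up for illustration and are not part of JSPWiki. It assumes the default Pattern flags, under which \p{Upper} is the POSIX (US-ASCII) class.

```java
import java.util.regex.Pattern;

public class UpperClassDemo {
    // POSIX class: with default flags it matches only the ASCII letters A-Z.
    static final Pattern POSIX_UPPER = Pattern.compile("\\p{Upper}");
    // Unicode general category Lu (uppercase letter).
    static final Pattern UNICODE_UPPER = Pattern.compile("\\p{Lu}");

    static boolean isPosixUpper(String s) {
        return POSIX_UPPER.matcher(s).matches();
    }

    static boolean isUnicodeUpper(String s) {
        return UNICODE_UPPER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String aUmlaut = "\u00C4"; // LATIN CAPITAL LETTER A WITH DIAERESIS
        System.out.println(isPosixUpper("A"));       // true
        System.out.println(isPosixUpper(aUmlaut));   // false
        System.out.println(isUnicodeUpper(aUmlaut)); // true
    }
}
```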



> Replace ORO regexp library with Java 5 regexps
> ----------------------------------------------
>
>                 Key: JSPWIKI-291
>                 URL: https://issues.apache.org/jira/browse/JSPWIKI-291
>             Project: JSPWiki
>          Issue Type: Task
>          Components: Core & storage, Filters, Plugins
>            Reporter: Janne Jalkanen
>            Assignee: Harry Metske
>            Priority: Trivial
>             Fix For: 3.0
>
>         Attachments: denouncePlugin.patch, JSPWiki-291.patch
>
>
> Now that Java has a good regexp library, it might be a good idea to get rid of the oro library, as it reduces our dependencies of external libraries.  This should be a relatively easy task, except when Glob patterns are being used, or when oro-specific regexps are used.
> No particular release is targeted; whenever we got time.  This would be a nice low-hanging fruit for someone to start contributing to JSPWiki with...


[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Harry Metske (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657689#action_12657689 ] 

Harry Metske commented on JSPWIKI-291:
--------------------------------------

Thanks a lot, I'll try to fix it this week.


[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Harry Metske (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656290#action_12656290 ] 

Harry Metske commented on JSPWIKI-291:
--------------------------------------

I started "converting" the AbstractReferralPlugin, which uses regexps for its include and exclude parameters.
At least that's what is documented at http://www.jspwiki.org/wiki/AbstractReferralPlugin.
However, because the patterns are handled by oro, we allow not only regexps but also globbing and more (as Janne already expected).
So the JUnit test fails because it uses '*7' as a pattern, which is a valid glob pattern but _not_ a valid regex pattern (changing every '*' to '.*' makes the JUnit test succeed again).

So what do we do with this ?
Accept the incompatibility with the current behavior, and document that we now really require a valid regexp ?
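
The failure described above is easy to reproduce with java.util.regex alone. A small sketch follows; the class GlobVsRegexDemo and the page name "Page7" are invented for illustration, and only Pattern and PatternSyntaxException are real API.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class GlobVsRegexDemo {
    // Returns true if java.util.regex accepts the string as a regular expression.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // '*7' is a fine glob, but as a regex the '*' has nothing to repeat.
        System.out.println(compiles("*7"));                  // false
        // After rewriting '*' as '.*' the pattern compiles and matches.
        System.out.println(compiles(".*7"));                 // true
        System.out.println(Pattern.matches(".*7", "Page7")); // true
    }
}
```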


> Replace ORO regexp library with Java 5 regexps
> ----------------------------------------------
>
>                 Key: JSPWIKI-291
>                 URL: https://issues.apache.org/jira/browse/JSPWIKI-291
>             Project: JSPWiki
>          Issue Type: Task
>          Components: Core & storage, Filters, Plugins
>            Reporter: Janne Jalkanen
>            Priority: Trivial
>             Fix For: 3.0
>
>         Attachments: denouncePlugin.patch
>
>
> Now that Java has a good regexp library, it might be a good idea to get rid of the oro library, as it reduces our dependencies of external libraries.  This should be a relatively easy task, except when Glob patterns are being used, or when oro-specific regexps are used.
> No particular release is targeted; whenever we got time.  This would be a nice low-hanging fruit for someone to start contributing to JSPWiki with...


[jira] Updated: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Thomas Engelschmidt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Engelschmidt updated JSPWIKI-291:
----------------------------------------

    Attachment: denouncePlugin.patch

DenouncePlugin :
One small patch migrating from oro to java.util.regex

Developed against the trunk branch


[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Harry Metske (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658176#action_12658176 ] 

Harry Metske commented on JSPWIKI-291:
--------------------------------------

Janne, any ideas ? :

I have tried all 4 options (the first, as said earlier, being the wrong one), but the others seem to fail because of something to do with the wiki pageName?


\p{Upper}          Onko tämä hyperlinkki: ÄitiSyöÖljyä?
\p{javaUpperCase}  Onko tämä hyperlinkki: <a class="createpage" href="/Edit.jsp?page=%C4itiSy" title="Create &quot;ÄitiSy&quot;">ÄitiSy</a>öÖljyä?
\p{Lu}             Onko tämä hyperlinkki: <a class="createpage" href="/Edit.jsp?page=%C4itiSy" title="Create &quot;ÄitiSy&quot;">ÄitiSy</a>öÖljyä?

The expected Result for the test is:

       Onko tämä hyperlinkki: <a class="wikipage" href="/Wiki.jsp?page=%C4itiSy%F6%D6ljy%E4">ÄitiSyöÖljyä</a>?
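
One plausible explanation for a link that stops at "ÄitiSy" (an assumption, not something confirmed in this thread) is that only the upper-case class was switched to Unicode while the lower-case class stayed ASCII-only, so the match ends at the first "ö". The sketch below uses two simplified, hypothetical CamelCase patterns, not the real WIKIWORD_REGEX, to show the effect.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CamelCaseDemo {
    // Hypothetical simplified patterns, NOT the actual WIKIWORD_REGEX.
    // Unicode upper case combined with ASCII-only lower case:
    static final Pattern MIXED = Pattern.compile("(?:\\p{Lu}\\p{Lower}+){2,}");
    // Unicode categories for both upper and lower case:
    static final Pattern UNICODE = Pattern.compile("(?:\\p{Lu}\\p{Ll}+){2,}");

    // Returns the first match of the pattern in the text, or null if none.
    static String firstMatch(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // "Onko tämä hyperlinkki: ÄitiSyöÖljyä?" written with Unicode escapes
        String text = "Onko t\u00E4m\u00E4 hyperlinkki: \u00C4itiSy\u00F6\u00D6ljy\u00E4?";
        System.out.println(firstMatch(MIXED, text));   // stops before the first o-umlaut
        System.out.println(firstMatch(UNICODE, text)); // covers the whole word
    }
}
```

If this is indeed the cause, the lower-case side of the pattern would need \p{Ll} (or \p{javaLowerCase}) as well.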


[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Janne Jalkanen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657209#action_12657209 ] 

Janne Jalkanen commented on JSPWIKI-291:
----------------------------------------

Yup.  http://svn.apache.org/viewvc/jakarta/oro/trunk/src/java/org/apache/oro/text/GlobCompiler.java?revision=124053&view=markup

And method globToPerl5().

Anybody want to extract that code, reformat to JSPWiki rules, and stick it into a RegexpUtil class?


[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Janne Jalkanen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657605#action_12657605 ] 

Janne Jalkanen commented on JSPWIKI-291:
----------------------------------------

Otherwise, the patch looks fine to me!  You might want to add a globToPerl5() version which accepts a String as opposed to a char array - it's a bit cleaner to use that way.


[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Janne Jalkanen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657611#action_12657611 ] 

Janne Jalkanen commented on JSPWIKI-291:
----------------------------------------

You could also use the \p{javaXXX} syntax, again, from Pattern javadoc.

"Categories that behave like the java.lang.Character boolean ismethodname methods (except for the deprecated ones) are available through the same \p{prop} syntax where the specified property has the name javamethodname. "

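As a quick illustration of the quoted javadoc passage, \p{javaUpperCase} should agree with Character.isUpperCase() for any single char. The class JavaPropertyDemo below is a made-up name for this sketch.

```java
import java.util.regex.Pattern;

public class JavaPropertyDemo {
    // \p{javaUpperCase} is specified to behave like Character.isUpperCase().
    static final Pattern JAVA_UPPER = Pattern.compile("\\p{javaUpperCase}");

    static boolean matchesJavaUpper(char c) {
        return JAVA_UPPER.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        // 'A', 'a', A-umlaut, o-umlaut
        char[] samples = { 'A', 'a', '\u00C4', '\u00F6' };
        for (char c : samples) {
            // Both columns are expected to agree for every sample.
            System.out.println(c + " regex=" + matchesJavaUpper(c)
                    + " Character=" + Character.isUpperCase(c));
        }
    }
}
```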

[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Janne Jalkanen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653226#action_12653226 ] 

Janne Jalkanen commented on JSPWIKI-291:
----------------------------------------

Thank you, Thomas!  The patch is included now in 3.0.0-svn-22.


[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Harry Metske (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657399#action_12657399 ] 

Harry Metske commented on JSPWIKI-291:
--------------------------------------

Yes I know, I already have this working, that problem is solved.

I'm now struggling with the JSPWikiMarkupParser and its WIKIWORD_REGEX, which is neither a Glob expression nor a Perl5 regex.
I think I still have about 4 JUnit tests (of 120 !) failing. I might need some help here with this regex stuff...

regards,
Harry


[jira] Commented: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Janne Jalkanen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656639#action_12656639 ] 

Janne Jalkanen commented on JSPWIKI-291:
----------------------------------------

I think it would be dangerous to change the patterns in any public config files or wikipages.  It could have adverse effects which would be invisible to the user.   (For example, if someone had a pattern like "Bug*", it would mean completely different things...)

However, it's probably not very complicated to make a Glob-to-Perl5 regexp translator.  I think that's how ORO handles them internally - and since it's Apache code, we could easily lift that bit of code from it.

Globs are also useful because they are so much more user-friendly than Perl5 regexps.  They're not as powerful, but they are much simpler to explain.
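
To make the "Bug*" example concrete, here is a rough sketch of such a translator. It is NOT the ORO GlobCompiler code: GlobSketch, globToRegex() and globMatches() are hypothetical names, and only the '*' and '?' wildcards are handled (no character classes or escapes).

```java
import java.util.regex.Pattern;

public class GlobSketch {
    // Hypothetical helper: translates a glob using '*' and '?' into a Java regex.
    // Everything else is treated as literal text and quoted.
    static String globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    // Flush the pending literal run, quoted so it has no special meaning.
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return regex.toString();
    }

    static boolean globMatches(String glob, String text) {
        return Pattern.matches(globToRegex(glob), text);
    }

    public static void main(String[] args) {
        // Read as a glob, "Bug*" means "anything starting with Bug".
        System.out.println(globMatches("Bug*", "Bugzilla"));    // true
        System.out.println(globMatches("Bug*", "Bu"));          // false
        // Read as a raw regex, "Bug*" means "Bu followed by zero or more g".
        System.out.println(Pattern.matches("Bug*", "Bugzilla")); // false
        System.out.println(Pattern.matches("Bug*", "Bu"));       // true
    }
}
```

Keeping the translation in one helper means existing glob patterns in config files and wiki pages can stay untouched.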


[jira] Closed: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Harry Metske (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harry Metske closed JSPWIKI-291.
--------------------------------

    Resolution: Fixed

thanks Janne, fixed in 3.0.0-svn-37.

> Replace ORO regexp library with Java 5 regexps
> ----------------------------------------------
>
>                 Key: JSPWIKI-291
>                 URL: https://issues.apache.org/jira/browse/JSPWIKI-291
>             Project: JSPWiki
>          Issue Type: Task
>          Components: Core & storage, Filters, Plugins
>            Reporter: Janne Jalkanen
>            Assignee: Harry Metske
>            Priority: Trivial
>             Fix For: 3.0
>
>         Attachments: denouncePlugin.patch, JSPWIKI-291-JSPWikiMarkupParserOnly.txt, JSPWiki-291.patch
>
>
> Now that Java has a good regexp library, it might be a good idea to get rid of the oro library, as it reduces our dependencies of external libraries.  This should be a relatively easy task, except when Glob patterns are being used, or when oro-specific regexps are used.
> No particular release is targeted; whenever we got time.  This would be a nice low-hanging fruit for someone to start contributing to JSPWiki with...


[jira] Updated: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Harry Metske (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harry Metske updated JSPWIKI-291:
---------------------------------

    Attachment: JSPWiki-291.patch

OK, a short summary; the following classes were patched:

- SpamFilter, which is difficult for me to test
- JSPWikiMarkupParser, the heart and soul of JSPWiki; it now fails only the testCCLinkWithScandics() method, which probably has something to do with code pages => can somebody have a look at this?
- AbstractReferralPlugin, fixed; it now uses com.ecyrd.jspwiki.util.RegExpUtil.globToPerl5() to convert globs to Perl5 regexps
- IfPlugin
- PluginManager
- ReferredPagesPlugin (I created a JUnit test for it, it didn't have one yet)
- RCSFileProvider

New classes:
- RegExpUtil, new class lifted over from the oro package (any special licensing issues here ?)
- ReferredPagesPluginTest

Deleted the oro.jar from the classpath.

I'd like some review/testing here before committing.
I have attached the patch.
regards,
Harry

> Replace ORO regexp library with Java 5 regexps
> ----------------------------------------------
>
>                 Key: JSPWIKI-291
>                 URL: https://issues.apache.org/jira/browse/JSPWIKI-291
>             Project: JSPWiki
>          Issue Type: Task
>          Components: Core & storage, Filters, Plugins
>            Reporter: Janne Jalkanen
>            Priority: Trivial
>             Fix For: 3.0
>
>         Attachments: denouncePlugin.patch, JSPWiki-291.patch
>
>
> Now that Java has a good regexp library, it might be a good idea to get rid of the oro library, as it reduces our dependencies of external libraries.  This should be a relatively easy task, except when Glob patterns are being used, or when oro-specific regexps are used.
> No particular release is targeted; whenever we got time.  This would be a nice low-hanging fruit for someone to start contributing to JSPWiki with...


[jira] Updated: (JSPWIKI-291) Replace ORO regexp library with Java 5 regexps

Posted by "Janne Jalkanen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JSPWIKI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Janne Jalkanen updated JSPWIKI-291:
-----------------------------------

    Attachment: JSPWIKI-291-JSPWikiMarkupParserOnly.txt

This patch works for me (and also makes the regexp far easier to read).  It is meant for JSPWikiMarkupParser class only, so you may have to do some creative merging :-)
