Posted to dev@maven.apache.org by "Arnaud Heritier (JIRA)" <ji...@codehaus.org> on 2006/01/15 01:41:01 UTC
[jira] Closed: (MPLINKCHECK-23) Improve linkcheck performance (2x+) getting rid of jtidy dependency via regexps
[ http://jira.codehaus.org/browse/MPLINKCHECK-23?page=all ]
Arnaud Heritier closed MPLINKCHECK-23:
--------------------------------------
Assign To: Arnaud Heritier
Resolution: Fixed
Fix Version: 1.4
Applied. Thanks a lot.
> Improve linkcheck performance (2x+) getting rid of jtidy dependency via regexps
> -------------------------------------------------------------------------------
>
> Key: MPLINKCHECK-23
> URL: http://jira.codehaus.org/browse/MPLINKCHECK-23
> Project: maven-linkcheck-plugin
> Type: Improvement
> Versions: 1.3.4
> Reporter: Ignacio G. Mac Dowell
> Assignee: Arnaud Heritier
> Fix For: 1.4
> Attachments: linkcheck.patch
>
>
> At the moment, the linkcheck plugin uses jtidy and xpath for retrieving all links. IMHO regexps would work much faster/better than the jtidy-xpath combination.
> The following regexp would be a replacement for the xpath expressions:
> <(?>link|a|img|script)[^>]*?(?>href|src)\s*?=\s*?[\"'](.*?)[\"'][^>]*?
> All tests pass with this regexp and in project ws-jaxme I am getting these results for maven-linkcheck-plugin:clearcache maven-linkcheck-plugin:report-real:
> with jtidy/xpath: Total time: 2 minutes 43 seconds
> with regexps: Total time: 1 minutes 10 seconds
> I am sure some regexp guru can improve the performance of this.
> I have a question, though. Are mailto links supposed to count as checkable? IMO no.
> PS: Also, IMO the createDocument method from LinkCheck should be in a try/finally block.
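
For readers who want to try the regexp quoted above, here is a minimal Java sketch of how it could be applied to pull link targets out of an HTML string. The class name LinkExtractor and the method extractLinks are hypothetical and are not taken from linkcheck.patch; only the pattern itself comes from the issue description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the plugin's actual code: applies the regexp
// from the issue description to an HTML string.
final class LinkExtractor {

    // Pattern copied from the issue text. Group 1 captures the value of
    // the href/src attribute of <link>, <a>, <img> and <script> tags.
    private static final Pattern LINK_PATTERN = Pattern.compile(
        "<(?>link|a|img|script)[^>]*?(?>href|src)\\s*?=\\s*?[\"'](.*?)[\"'][^>]*?");

    private LinkExtractor() {
    }

    // Returns every captured link target, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK_PATTERN.matcher(html);
        while (matcher.find()) {
            links.add(matcher.group(1));
        }
        return links;
    }
}
```

Calling LinkExtractor.extractLinks on a page's contents returns the raw attribute values. Whether mailto targets should then be filtered out is the open question raised in the report, so the sketch leaves them in.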
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org