Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2007/08/09 00:30:59 UTC
[jira] Resolved: (LUCENE-966) A faster JFlex-based replacement for StandardAnalyzer
[ https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless resolved LUCENE-966.
---------------------------------------
Resolution: Fixed
Lucene Fields: [New, Patch Available] (was: [New])
OK I committed this! Thank you Stanislaw!
I ran a quick perf test on Wikipedia (first 50K docs only) and found
the new StandardTokenizer is ~6X faster -- awesome :)
I made these small additional changes over the final patch before
committing:
* I removed StandardAnalyzer.html "grammar doc" generation from
build.xml since it was using jjdoc. Stanislaw, is there something
in jflex that can generate a BNF description of the grammar as
HTML?
* I removed the @author tag from StandardTokenizer.java: we are
removing all such tags and instead giving credit in CHANGES.txt.
* I removed the whitespace-only diffs from common-build.xml &
build.xml.
* I put back the big comment that describes this tokenizer in
StandardTokenizer.java.
* I put standard Apache copyright headers in all sources.
> A faster JFlex-based replacement for StandardAnalyzer
> -----------------------------------------------------
>
> Key: LUCENE-966
> URL: https://issues.apache.org/jira/browse/LUCENE-966
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis
> Reporter: Stanislaw Osinski
> Fix For: 2.3
>
> Attachments: AnalyzerBenchmark.java, jflex-analyzer-patch.txt, jflex-analyzer-r560135-patch.txt, jflex-analyzer-r561292-patch.txt, jflex-analyzer-r561693-compatibility.txt, jflex-analyzer-r562378-patch-nodup.txt, jflex-analyzer-r562378-patch.txt
>
>
> JFlex (http://www.jflex.de/) can be used to generate a faster (up to several times) replacement for StandardAnalyzer. Will add a patch and simple benchmark code in a while.