Posted to java-user@lucene.apache.org by Phan The Dai <th...@gmail.com> on 2010/02/07 16:36:38 UTC

Recommend a example to implement an analyzer with parsing Camelcase

Could you suggest an example of implementing an analyzer that parses
CamelCase?

I can override methods to use StopFilter, PorterStemFilter, and
LowerCaseTokenizer, but I have no solution for a new filter that differs
from these available ones.
Thank you!

Re: Recommend a example to implement an analyzer with parsing Camelcase

Posted by Phan The Dai <th...@gmail.com>.
That gives me more details.
Thank you very much!

On Mon, Feb 8, 2010 at 1:37 AM, Ahmet Arslan <io...@yahoo.com> wrote:

>
> > Hi Ahmet,
> > I know of WordDelimiterFilterFactory, but I have never used
> > Solr.
> > How can I download this class?
>
> http://repo1.maven.org/maven2/org/apache/solr/solr-core/1.4.0/
>
> > Can I use it in Lucene 3.0, or should I extend Analyzer and
> > override its
> > methods?
>
> It does not use the new token stream API yet, but you can still use it.
> WordDelimiterFilter is package-private, but you can use its factory as
> follows:
>
> Map<String, String> delimeterArgs = new HashMap<String, String>(9);
>
>  delimeterArgs.put("generateWordParts", "1");
>  delimeterArgs.put("generateNumberParts", "0");
>  delimeterArgs.put("catenateWords", "0");
>  delimeterArgs.put("catenateNumbers", "0");
>  delimeterArgs.put("catenateAll", "0");
>  delimeterArgs.put("splitOnCaseChange", "1"); // "1" splits on case changes (CamelCase)
>  delimeterArgs.put("splitOnNumerics", "1");
>  delimeterArgs.put("preserveOriginal", "1");
>  delimeterArgs.put("stemEnglishPossessive", "0");
>
> WordDelimiterFilterFactory wordDelimiterFactory = new
> WordDelimiterFilterFactory();
>
> wordDelimiterFactory.init(delimeterArgs);
>
> You can append it to your analyzer chain:
>
> result = wordDelimiterFactory.create(result);
>
> The parameters are explained in the wiki.
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Recommend a example to implement an analyzer with parsing Camelcase

Posted by Ahmet Arslan <io...@yahoo.com>.
> Hi Ahmet,
> I know of WordDelimiterFilterFactory, but I have never used
> Solr.
> How can I download this class?

http://repo1.maven.org/maven2/org/apache/solr/solr-core/1.4.0/

> Can I use it in Lucene 3.0, or should I extend Analyzer and
> override its
> methods?

It does not use the new token stream API yet, but you can still use it. WordDelimiterFilter is package-private, but you can use its factory as follows:

Map<String, String> delimeterArgs = new HashMap<String, String>(9);

 delimeterArgs.put("generateWordParts", "1");
 delimeterArgs.put("generateNumberParts", "0");
 delimeterArgs.put("catenateWords", "0");
 delimeterArgs.put("catenateNumbers", "0");
 delimeterArgs.put("catenateAll", "0");
 delimeterArgs.put("splitOnCaseChange", "1"); // "1" splits on case changes (CamelCase)
 delimeterArgs.put("splitOnNumerics", "1");
 delimeterArgs.put("preserveOriginal", "1");
 delimeterArgs.put("stemEnglishPossessive", "0");

WordDelimiterFilterFactory wordDelimiterFactory = new WordDelimiterFilterFactory();

wordDelimiterFactory.init(delimeterArgs);

You can append it to your analyzer chain:

result = wordDelimiterFactory.create(result);

The parameters are explained in the wiki.


      




Re: Recommend a example to implement an analyzer with parsing Camelcase

Posted by Phan The Dai <th...@gmail.com>.
Hi Ahmet,
I know of WordDelimiterFilterFactory, but I have never used Solr.
How can I download this class?
Can I use it in Lucene 3.0, or should I extend Analyzer and override its
methods?
Sorry if my questions are too detailed.


On Mon, Feb 8, 2010 at 1:11 AM, Ahmet Arslan <io...@yahoo.com> wrote:

> > Could you suggest an
> > example of implementing an analyzer that
> > parses CamelCase?
> >
> > I can override methods to use StopFilter, PorterStemFilter,
> > and LowerCaseTokenizer,
> > but I have no solution for a new filter that differs from
> > these available ones.
> > Thank you!
>
> You can use WordDelimiterFilterFactory[1] with splitOnCaseChange="1"
>
> [1]
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
>
> You need to get it from the Solr artifacts.
>
>
>
>
>
>

Re: Recommend a example to implement an analyzer with parsing Camelcase

Posted by Ahmet Arslan <io...@yahoo.com>.
> Could you suggest an
> example of implementing an analyzer that
> parses CamelCase?
>
> I can override methods to use StopFilter, PorterStemFilter,
> and LowerCaseTokenizer,
> but I have no solution for a new filter that differs from
> these available ones.
> Thank you!

You can use WordDelimiterFilterFactory[1] with splitOnCaseChange="1" 

[1]http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

You need to get it from the Solr artifacts.
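
To make the idea behind splitOnCaseChange concrete, here is a minimal plain-Java sketch of splitting a token at lowercase-to-uppercase transitions. It is only an illustration under my own assumptions, not the Solr implementation: the class name CamelCaseSplitSketch and the helper splitOnCaseChange are hypothetical, and the real WordDelimiterFilter handles many more cases (delimiters, numerics, original-token preservation).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene or Solr.
class CamelCaseSplitSketch {

    // Splits a token wherever a lowercase letter is immediately
    // followed by an uppercase letter, e.g. "CamelCase" -> Camel, Case.
    static List<String> splitOnCaseChange(String token) {
        List<String> parts = new ArrayList<String>();
        int start = 0;
        for (int i = 1; i < token.length(); i++) {
            char prev = token.charAt(i - 1);
            char curr = token.charAt(i);
            if (Character.isLowerCase(prev) && Character.isUpperCase(curr)) {
                parts.add(token.substring(start, i));
                start = i;
            }
        }
        // Add the trailing part (nothing is added for an empty token).
        if (start < token.length()) {
            parts.add(token.substring(start));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitOnCaseChange("CamelCase"));            // [Camel, Case]
        System.out.println(splitOnCaseChange("wordDelimiterFactory")); // [word, Delimiter, Factory]
    }
}
```

A token with no lowercase-to-uppercase transition, such as "lucene", comes back as a single part in this sketch.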


      
