Posted to solr-user@lucene.apache.org by Benson Margulies <be...@basistech.com> on 2014/01/15 22:52:12 UTC

Analyzers versus Tokenizers/TokenFilters

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters never
mentions an Analyzer class.

http://wiki.apache.org/solr/SolrPlugins talks about subclasses of
SolrAnalyzer as ways of delivering an entire analysis chain and still
'minding the gap'.

Anyone care to offer a comparison of the viewpoints?

Re: Analyzers versus Tokenizers/TokenFilters

Posted by Ahmet Arslan <io...@yahoo.com>.
Plus, the admin analysis page nicely displays the intermediate tokens produced by each component. A very nice feature, I think. If you plug in a Lucene analyzer, you won't be able to see the intermediate results.

Ahmet  
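
[Editor's sketch] The point about intermediate tokens can be illustrated with a toy model. The Java below is not Solr or Lucene code: the class StageTraceSketch, its methods and the stage names are all hypothetical, and the three stages (whitespace split, lowercase, stopword removal) are assumptions chosen only for illustration. It shows why a chain declared component by component can report the tokens after every stage, while a chain hidden behind a single analyzer only yields the final result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class StageTraceSketch {

    // Runs a fixed three-stage chain and records the tokens after every
    // stage, in order. Stage names are made up for this sketch.
    public static Map<String, List<String>> trace(String text) {
        Map<String, List<String>> stages = new LinkedHashMap<>();

        // Stage 1: whitespace "tokenizer".
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        stages.put("tokenizer", tokens);

        // Stage 2: lowercase "filter".
        List<String> lowered = new ArrayList<>();
        for (String t : tokens) {
            lowered.add(t.toLowerCase(Locale.ROOT));
        }
        stages.put("lowercase", lowered);

        // Stage 3: stopword "filter" with a tiny hard-coded list.
        List<String> stopwords = Arrays.asList("the", "a", "an");
        List<String> kept = new ArrayList<>();
        for (String t : lowered) {
            if (!stopwords.contains(t)) {
                kept.add(t);
            }
        }
        stages.put("stopwords", kept);

        return stages;
    }

    // A black-box analyzer exposes only the final token list.
    public static List<String> opaque(String text) {
        List<String> last = new ArrayList<>();
        for (List<String> stage : trace(text).values()) {
            last = stage;
        }
        return last;
    }

    public static void main(String[] args) {
        // {tokenizer=[The, Quick, Fox], lowercase=[the, quick, fox], stopwords=[quick, fox]}
        System.out.println(trace("The Quick Fox"));
        // [quick, fox]
        System.out.println(opaque("The Quick Fox"));
    }
}
```

The per-stage map is roughly the kind of view the analysis page can offer for a tokenizer/filter chain; with a single analyzer class, only the output of opaque() would be visible.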




Re: Analyzers versus Tokenizers/TokenFilters

Posted by Otis Gospodnetic <ot...@gmail.com>.
But the latter gives users the flexibility of putting together any
T+F1....FN chains they want and easily adding their own custom Fx to the
mix.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



Re: Analyzers versus Tokenizers/TokenFilters

Posted by Benson Margulies <bi...@gmail.com>.
Ahmet,

So, this is an interesting difference between Lucene (and ES) and
Solr. In Lucene, the idea seems to be that you package up a reusable
analysis chain as an analyzer. Saying 'use analyzer X' is less complex
than saying 'use tokenizer T and filters F1, F2, ...'.

thanks,
benson
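
[Editor's sketch] The two viewpoints above ("use analyzer X" versus "use tokenizer T and filters F1, F2, ...") can be contrasted in a small toy model. This is plain Java with no Lucene or Solr dependency; AnalysisChainSketch, runChain, packagedAnalyzer and the tokenizer/filter stand-ins are hypothetical names invented for this illustration, not real library APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.function.UnaryOperator;

public class AnalysisChainSketch {

    // Hypothetical stand-in for a tokenizer T: splits raw text on whitespace.
    public static final Function<String, List<String>> WHITESPACE_TOKENIZER = text -> {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    };

    // Hypothetical stand-in for a filter F1: lowercases every token.
    public static final UnaryOperator<List<String>> LOWERCASE = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    };

    // Hypothetical stand-in for a filter F2: drops a few hard-coded stopwords.
    public static final UnaryOperator<List<String>> DROP_STOPWORDS = tokens -> {
        List<String> stopwords = Arrays.asList("the", "a", "an");
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!stopwords.contains(t)) {
                out.add(t);
            }
        }
        return out;
    };

    // "Use tokenizer T and filters F1, F2, ...": the caller assembles the chain.
    @SafeVarargs
    public static List<String> runChain(String text,
                                        Function<String, List<String>> tokenizer,
                                        UnaryOperator<List<String>>... filters) {
        List<String> tokens = tokenizer.apply(text);
        for (UnaryOperator<List<String>> filter : filters) {
            tokens = filter.apply(tokens);
        }
        return tokens;
    }

    // "Use analyzer X": the same chain, fixed and hidden behind one name.
    public static List<String> packagedAnalyzer(String text) {
        return runChain(text, WHITESPACE_TOKENIZER, LOWERCASE, DROP_STOPWORDS);
    }

    public static void main(String[] args) {
        // [quick, brown, fox]
        System.out.println(packagedAnalyzer("The Quick Brown Fox"));
        // [the, quick, brown, fox]
        System.out.println(runChain("The Quick Brown Fox", WHITESPACE_TOKENIZER, LOWERCASE));
    }
}
```

The packaged form is simpler to name and reuse; the explicit form makes it easy to drop, reorder or add a stage, as the second call in main does by leaving out the stopword filter.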



Re: Analyzers versus Tokenizers/TokenFilters

Posted by Ahmet Arslan <io...@yahoo.com>.
Hi Benson,

Using a Lucene analyzer in schema.xml should be a last resort, reserved for very specific reasons: for example, if you already have an existing analyzer.

Ahmet

