Posted to java-user@lucene.apache.org by Terry Smith <sh...@gmail.com> on 2016/07/14 17:26:28 UTC

CustomAnalyzer and AttributeFactories

I've hit a runtime issue when consuming the nightly 7.0.0-SNAPSHOT maven
build and was wondering if someone could shed some light on it.

Some custom code is causing the following exception:

java.lang.IllegalArgumentException: State contains AttributeImpl of type
org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl that is
not in in this AttributeSource

This seems to be related to the change made in
https://issues.apache.org/jira/browse/LUCENE-7355, which modifies
CustomAnalyzer like so:

   protected TokenStreamComponents createComponents(String fieldName) {
-    final Tokenizer tk = tokenizer.create();
+    final Tokenizer tk = tokenizer.create(attributeFactory());

I'm trying to untangle the attribute factory logic and have so far figured
out that CustomAnalyzer is now using AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY
whereas it used to use TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY.

The old default would use a PackedTokenAttributeImpl, whereas the new
default seems to create a separate implementation class for each of the
common attributes.
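
To make the failure mode concrete, here is a minimal stand-alone sketch in
plain Java. It is a toy model, not Lucene code: the names
AttributeStateModel, Source, TermImpl, PackedImpl and restoreAcross are all
invented for illustration. The one assumption carried over from Lucene is
that a captured state is matched back to its target by the concrete
implementation class, which is what the exception message suggests.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: none of these types exist in Lucene.
final class AttributeStateModel {

  /** Stand-in for an attribute implementation object. */
  static class Impl {
    String value = "";
  }

  /** Hypothetical analogue of a per-attribute impl such as CharTermAttributeImpl. */
  static final class TermImpl extends Impl {}

  /** Hypothetical analogue of a single packed impl such as PackedTokenAttributeImpl. */
  static final class PackedImpl extends Impl {}

  /** Stand-in for an attribute source: impls are registered by concrete class. */
  static final class Source {
    private final Map<Class<? extends Impl>, Impl> impls = new LinkedHashMap<>();

    Source(Impl... created) {
      for (Impl impl : created) {
        impls.put(impl.getClass(), impl);
      }
    }

    /** Snapshot the current values, keyed by concrete impl class. */
    Map<Class<? extends Impl>, String> captureState() {
      Map<Class<? extends Impl>, String> state = new LinkedHashMap<>();
      for (Map.Entry<Class<? extends Impl>, Impl> e : impls.entrySet()) {
        state.put(e.getKey(), e.getValue().value);
      }
      return state;
    }

    /** Copy a snapshot back; fails if this source lacks one of the impl classes. */
    void restoreState(Map<Class<? extends Impl>, String> state) {
      for (Map.Entry<Class<? extends Impl>, String> e : state.entrySet()) {
        Impl target = impls.get(e.getKey());
        if (target == null) {
          throw new IllegalArgumentException(
              "State contains impl of type " + e.getKey().getSimpleName()
                  + " that is not in this source");
        }
        target.value = e.getValue();
      }
    }
  }

  /** Returns "ok" if the state of 'from' can be restored into 'to', else the error message. */
  static String restoreAcross(Source from, Source to) {
    try {
      to.restoreState(from.captureState());
      return "ok";
    } catch (IllegalArgumentException ex) {
      return ex.getMessage();
    }
  }

  public static void main(String[] args) {
    Source separate = new Source(new TermImpl()); // "new default" style
    Source packed = new Source(new PackedImpl()); // "old default" style
    // Same impl classes on both sides: the restore succeeds.
    System.out.println(restoreAcross(separate, new Source(new TermImpl())));
    // Different impl classes: the restore is rejected.
    System.out.println(restoreAcross(separate, packed));
  }
}
```

If this model is right, any code that captures state from a stream built
with one factory and restores it into a stream built with the other would
fail in the same way, even though both expose the same attribute interfaces.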

Unfortunately, CustomAnalyzer is final (with a default protected
constructor), making it impossible to extend it and override the
attributeFactory method.

I think I should fix this by making CustomAnalyzer use the previous default
attribute factory that uses PackedTokenAttributeImpl but am unsure how to
achieve that.

Am I understanding my problem and ideal solution correctly here? If so,
should I submit a patch to open up CustomAnalyzer to achieve this goal?

--Terry

Re: CustomAnalyzer and AttributeFactories

Posted by Terry Smith <sh...@gmail.com>.
Uwe,

Thanks! I've created LUCENE-7382
<https://issues.apache.org/jira/browse/LUCENE-7382> for this issue.

--Terry


On Thu, Jul 14, 2016 at 3:54 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Can you open an issue? This is a bug because the wrong default is used.
>
> Uwe

Re: CustomAnalyzer and AttributeFactories

Posted by Uwe Schindler <uw...@thetaphi.de>.
Can you open an issue? This is a bug because the wrong default is used.

Uwe

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org