Posted to users@solr.apache.org by Thomas Woodard <tw...@eline.com> on 2022/06/07 14:22:28 UTC

"this.stopWords" is null

I had an 8.11.1 implementation in progress when 9.0 came out, and I'm trying
to convert it so we don't go live on an already outdated version. I'm having
trouble adding documents to the index; the same documents indexed fine with
8.11.1. A shortened error is below:

2022-06-07 13:49:24.190 ERROR (qtp554868511-21) [ x:sku] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Exception writing document id 6-1-TB-0701 to the index; possible analysis error. => org.apache.solr.common.SolrException: Exception writing document id 6-1-TB-0701 to the index; possible analysis error.
Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.lucene.analysis.CharArraySet.contains(char[], int, int)" because "this.stopWords" is null
        at org.apache.lucene.analysis.StopFilter.accept(StopFilter.java:97) ~[?:?]
        at org.apache.lucene.analysis.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:52) ~[?:?]
        at org.apache.lucene.analysis.LowerCaseFilter.incrementToken(LowerCaseFilter.java:37) ~[?:?]
        at org.apache.lucene.index.IndexingChain$PerField.invert(IndexingChain.java:1142) ~[?:?]
        at org.apache.lucene.index.IndexingChain.processField(IndexingChain.java:729) ~[?:?]
        at org.apache.lucene.index.IndexingChain.processDocument(IndexingChain.java:620) ~[?:?]
        at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:239) ~[?:?]
        at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:432) ~[?:?]
        at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1530) ~[?:?]
        at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1519) ~[?:?]
        at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:1046) ~[?:?]
        at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:416) ~[?:?]
        at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:369) ~[?:?]
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:300) ~[?:?]
        ... 80 more

I have double checked all the stop filters in my schema.xml in my
configset, and they all seem fine. The import should only be using
text_general, which is configured like this:
   <fieldType name="text_general" class="solr.TextField"
positionIncrementGap="100" multiValued="true">
      <analyzer type="index">
        <tokenizer name="standard"/>
        <filter name="stop" ignoreCase="true" words="./stopwords.txt" />
        <!-- in this example, we will only use synonyms at query time
        <filter name="synonymGraph" synonyms="index_synonyms.txt"
ignoreCase="true" expand="false"/>
        <filter name="flattenGraph"/>
        -->
        <filter name="lowercase"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer name="standard"/>
        <filter name="stop" ignoreCase="true" words="./stopwords.txt" />
        <filter name="synonymGraph" synonyms="./synonyms.txt"
ignoreCase="true" expand="true"/>
        <filter name="lowercase"/>
      </analyzer>
    </fieldType>

I can't figure out what the problem is, or how to do more detailed
debugging to find it. Any help would be greatly appreciated.

Re: "this.stopWords" is null

Posted by Thomas Woodard <tw...@eline.com>.
Yes, that exactly matched my situation. Thanks for the info, though for now
I've just gone back to 8.11.1 to keep the project moving.


Re: "this.stopWords" is null

Posted by Chris Hostetter <ho...@fucit.org>.
I suspect you are hitting this bug...

https://issues.apache.org/jira/browse/SOLR-16203

...but AFAIK that would only happen if you are explicitly using
ClassicIndexSchemaFactory in your solrconfig.xml ... can you confirm?

Assuming I'm correct, then either switching to ManagedIndexSchemaFactory
(and renaming your schema.xml accordingly, or letting it rename
automatically on startup) *OR* switching all your factory declarations to
use the "solr.ClassName" syntax should make the problem go away.

If it does not, that's very curious -- and a full copy of your entire
configset would probably be needed to give you additional advice.
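
For illustration only, here is a rough sketch of what those two workarounds
could look like. It is not taken from this thread's configset; it assumes the
stock Solr 9 factory class names and the default managed schema resource
name, so adjust to match your own files.

Switching the schema factory in solrconfig.xml:

    <!-- sketch: replace an explicit classic declaration ...              -->
    <!-- <schemaFactory class="ClassicIndexSchemaFactory"/>               -->
    <!-- ... with the managed factory; schema.xml gets picked up/renamed   -->
    <!-- as managed-schema.xml (the assumed default; yours may differ)     -->
    <schemaFactory class="ManagedIndexSchemaFactory">
      <bool name="mutable">true</bool>
      <str name="managedSchemaResourceName">managed-schema.xml</str>
    </schemaFactory>

Or, keeping ClassicIndexSchemaFactory, spelling out the analysis factories
with the "solr.ClassName" syntax instead of the short name= form:

    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
      <analyzer type="index">
        <!-- same chain as the name="..." version above, written with explicit factory classes -->
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="./stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="./stopwords.txt"/>
        <filter class="solr.SynonymGraphFilterFactory" synonyms="./synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>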



-Hoss
http://www.lucidworks.com/

Re: "this.stopWords" is null

Posted by Thomas Woodard <tw...@eline.com>.
Commenting out the stop filter allowed documents to be indexed, confirming
it was actually the problem. But then queries fail because the synonyms
can't be found, for what looks like a similar reason.

I've also tried switching the files to use absolute paths like the one
below, but that does not work either:
        <filter name="stop" ignoreCase="true" words="/var/solr/data/configsets/common/conf/stopwords.txt" />

It certainly seems like the Solr configuration is simply not initializing
the Lucene filters correctly.
