You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Thomas Corthals <th...@klascement.net> on 2020/02/17 15:31:28 UTC

Re-creating deleted Managed Stopwords lists results in error

Hi

I've run into an issue with creating a Managed Stopwords list that has the
same name as a previously deleted list. Going through the same flow with
Managed Synonyms doesn't result in this unexpected behaviour. Am I missing
something or did I discover a bug in Solr?

On a newly started solr with the techproducts core:

curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
curl -X DELETE
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist

The second PUT request results in a status 500 with error
msg "java.util.LinkedHashMap cannot be cast to java.util.List".

Similar requests for synonyms work fine, no matter how many times I repeat
the CREATE/DELETE/RELOAD cycle:

curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
curl -X DELETE
http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap

Reloading after creating the Stopwords list but not after deleting it works
without error too on a fresh techproducts core (you'll have to remove the
directory from disk and create the core again after running the previous
commands).

curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
curl -X DELETE
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist

And even curiouser, when doing a CREATE/DELETE for Stopwords, then a
CREATE/DELETE for Synonyms, and only then a RELOAD of the core, the cycle
can be completed twice. (Again, on a freshly created techproducts core.)
Only the third attempt to create a list results in an error. Synonyms can
still be created and deleted repeatedly after this.

curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
curl -X DELETE
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
curl -X DELETE
http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
curl -X DELETE
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
curl -X DELETE
http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
curl -X PUT -H 'Content-type:application/json' --data-binary
'{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist

The same successes/errors occur when running each cycle against a different
core if the cores share the same configset.

Any ideas on what might be going wrong?

Re: Re-creating deleted Managed Stopwords lists results in error

Posted by Walter Underwood <wu...@wunderwood.org>.
Make phrases into single tokens at indexing and query time. Let the engine do
the rest of the work.

For example, “subunits of the army” can become “subunitsofthearmy” or “subunits_of_the_army”.
We used patterns to choose phrases, so “word word”, “word glue word”, or “word glue glue word”
could become phrases.

Nutch did something like this, but used it for filtering down the candidates for matching,
then used regular Lucene scoring for ranking.

The Infoseek Ultra index used these phrase terms but did not store positions.

The idea came from early DNA search engines.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 17, 2020, at 10:53 AM, David Hastings <ha...@gmail.com> wrote:
> 
> interesting, i cant seem to find anything on Phrase IDF, dont suppose you
> have a link or two i could look at by chance?
> 
> On Mon, Feb 17, 2020 at 1:48 PM Walter Underwood <wu...@wunderwood.org>
> wrote:
> 
>> At Infoseek, we used “glue words” to build phrase tokens. It was really
>> effective.
>> Phrase IDF is powerful stuff.
>> 
>> Luckily for you, the patent on that has expired. :-)
>> 
>> wunder
>> Walter Underwood
>> wunder@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Feb 17, 2020, at 10:46 AM, David Hastings <
>> hastings.recursive@gmail.com> wrote:
>>> 
>>> i use stop words for building shingles into "interesting phrases" for my
>>> machine teacher/students, so i wouldnt say theres no reason, however my
>> use
>>> case is very specific.  Otherwise yeah, theyre gone for all practical
>>> reasons/search scenarios.
>>> 
>>> On Mon, Feb 17, 2020 at 1:41 PM Walter Underwood <wu...@wunderwood.org>
>>> wrote:
>>> 
>>>> Why are you using stopwords? I would need a really, really good reason
>> to
>>>> use those.
>>>> 
>>>> Stopwords are an obsolete technique from 16-bit processors. I’ve never
>>>> used them and
>>>> I’ve been a search engineer since 1997.
>>>> 
>>>> wunder
>>>> Walter Underwood
>>>> wunder@wunderwood.org
>>>> http://observer.wunderwood.org/  (my blog)
>>>> 
>>>>> On Feb 17, 2020, at 7:31 AM, Thomas Corthals <th...@klascement.net>
>>>> wrote:
>>>>> 
>>>>> Hi
>>>>> 
>>>>> I've run into an issue with creating a Managed Stopwords list that has
>>>> the
>>>>> same name as a previously deleted list. Going through the same flow
>> with
>>>>> Managed Synonyms doesn't result in this unexpected behaviour. Am I
>>>> missing
>>>>> something or did I discover a bug in Solr?
>>>>> 
>>>>> On a newly started solr with the techproducts core:
>>>>> 
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> curl -X DELETE
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> curl
>>>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> 
>>>>> The second PUT request results in a status 500 with error
>>>>> msg "java.util.LinkedHashMap cannot be cast to java.util.List".
>>>>> 
>>>>> Similar requests for synonyms work fine, no matter how many times I
>>>> repeat
>>>>> the CREATE/DELETE/RELOAD cycle:
>>>>> 
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
>>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>>>> curl -X DELETE
>>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>>>> curl
>>>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
>>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>>>> 
>>>>> Reloading after creating the Stopwords list but not after deleting it
>>>> works
>>>>> without error too on a fresh techproducts core (you'll have to remove
>> the
>>>>> directory from disk and create the core again after running the
>> previous
>>>>> commands).
>>>>> 
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> curl
>>>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>>>> curl -X DELETE
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> 
>>>>> And even curiouser, when doing a CREATE/DELETE for Stopwords, then a
>>>>> CREATE/DELETE for Synonyms, and only then a RELOAD of the core, the
>> cycle
>>>>> can be completed twice. (Again, on a freshly created techproducts
>> core.)
>>>>> Only the third attempt to create a list results in an error. Synonyms
>> can
>>>>> still be created and deleted repeatedly after this.
>>>>> 
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> curl -X DELETE
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
>>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>>>> curl -X DELETE
>>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>>>> curl
>>>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> curl -X DELETE
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
>>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>>>> curl -X DELETE
>>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>>>> curl
>>>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>>>> 
>>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>>>> 
>>>>> The same successes/errors occur when running each cycle against a
>>>> different
>>>>> core if the cores share the same configset.
>>>>> 
>>>>> Any ideas on what might be going wrong?
>>>> 
>>>> 
>> 
>> 


Re: Re-creating deleted Managed Stopwords lists results in error

Posted by David Hastings <ha...@gmail.com>.
interesting, i cant seem to find anything on Phrase IDF, dont suppose you
have a link or two i could look at by chance?

On Mon, Feb 17, 2020 at 1:48 PM Walter Underwood <wu...@wunderwood.org>
wrote:

> At Infoseek, we used “glue words” to build phrase tokens. It was really
> effective.
> Phrase IDF is powerful stuff.
>
> Luckily for you, the patent on that has expired. :-)
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Feb 17, 2020, at 10:46 AM, David Hastings <
> hastings.recursive@gmail.com> wrote:
> >
> > i use stop words for building shingles into "interesting phrases" for my
> > machine teacher/students, so i wouldnt say theres no reason, however my
> use
> > case is very specific.  Otherwise yeah, theyre gone for all practical
> > reasons/search scenarios.
> >
> > On Mon, Feb 17, 2020 at 1:41 PM Walter Underwood <wu...@wunderwood.org>
> > wrote:
> >
> >> Why are you using stopwords? I would need a really, really good reason
> to
> >> use those.
> >>
> >> Stopwords are an obsolete technique from 16-bit processors. I’ve never
> >> used them and
> >> I’ve been a search engineer since 1997.
> >>
> >> wunder
> >> Walter Underwood
> >> wunder@wunderwood.org
> >> http://observer.wunderwood.org/  (my blog)
> >>
> >>> On Feb 17, 2020, at 7:31 AM, Thomas Corthals <th...@klascement.net>
> >> wrote:
> >>>
> >>> Hi
> >>>
> >>> I've run into an issue with creating a Managed Stopwords list that has
> >> the
> >>> same name as a previously deleted list. Going through the same flow
> with
> >>> Managed Synonyms doesn't result in this unexpected behaviour. Am I
> >> missing
> >>> something or did I discover a bug in Solr?
> >>>
> >>> On a newly started solr with the techproducts core:
> >>>
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>> curl -X DELETE
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>> curl
> >> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>>
> >>> The second PUT request results in a status 500 with error
> >>> msg "java.util.LinkedHashMap cannot be cast to java.util.List".
> >>>
> >>> Similar requests for synonyms work fine, no matter how many times I
> >> repeat
> >>> the CREATE/DELETE/RELOAD cycle:
> >>>
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> >>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> >>>
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> >>> curl -X DELETE
> >>>
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> >>> curl
> >> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> >>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> >>>
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> >>>
> >>> Reloading after creating the Stopwords list but not after deleting it
> >> works
> >>> without error too on a fresh techproducts core (you'll have to remove
> the
> >>> directory from disk and create the core again after running the
> previous
> >>> commands).
> >>>
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>> curl
> >> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> >>> curl -X DELETE
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>>
> >>> And even curiouser, when doing a CREATE/DELETE for Stopwords, then a
> >>> CREATE/DELETE for Synonyms, and only then a RELOAD of the core, the
> cycle
> >>> can be completed twice. (Again, on a freshly created techproducts
> core.)
> >>> Only the third attempt to create a list results in an error. Synonyms
> can
> >>> still be created and deleted repeatedly after this.
> >>>
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>> curl -X DELETE
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> >>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> >>>
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> >>> curl -X DELETE
> >>>
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> >>> curl
> >> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>> curl -X DELETE
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> >>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> >>>
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> >>> curl -X DELETE
> >>>
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> >>> curl
> >> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> >>> curl -X PUT -H 'Content-type:application/json' --data-binary
> >>>
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >>>
> >>
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >>>
> >>> The same successes/errors occur when running each cycle against a
> >> different
> >>> core if the cores share the same configset.
> >>>
> >>> Any ideas on what might be going wrong?
> >>
> >>
>
>

Re: Re-creating deleted Managed Stopwords lists results in error

Posted by Walter Underwood <wu...@wunderwood.org>.
At Infoseek, we used “glue words” to build phrase tokens. It was really effective.
Phrase IDF is powerful stuff.

Luckily for you, the patent on that has expired. :-)

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 17, 2020, at 10:46 AM, David Hastings <ha...@gmail.com> wrote:
> 
> i use stop words for building shingles into "interesting phrases" for my
> machine teacher/students, so i wouldnt say theres no reason, however my use
> case is very specific.  Otherwise yeah, theyre gone for all practical
> reasons/search scenarios.
> 
> On Mon, Feb 17, 2020 at 1:41 PM Walter Underwood <wu...@wunderwood.org>
> wrote:
> 
>> Why are you using stopwords? I would need a really, really good reason to
>> use those.
>> 
>> Stopwords are an obsolete technique from 16-bit processors. I’ve never
>> used them and
>> I’ve been a search engineer since 1997.
>> 
>> wunder
>> Walter Underwood
>> wunder@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Feb 17, 2020, at 7:31 AM, Thomas Corthals <th...@klascement.net>
>> wrote:
>>> 
>>> Hi
>>> 
>>> I've run into an issue with creating a Managed Stopwords list that has
>> the
>>> same name as a previously deleted list. Going through the same flow with
>>> Managed Synonyms doesn't result in this unexpected behaviour. Am I
>> missing
>>> something or did I discover a bug in Solr?
>>> 
>>> On a newly started solr with the techproducts core:
>>> 
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> curl -X DELETE
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> curl
>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> 
>>> The second PUT request results in a status 500 with error
>>> msg "java.util.LinkedHashMap cannot be cast to java.util.List".
>>> 
>>> Similar requests for synonyms work fine, no matter how many times I
>> repeat
>>> the CREATE/DELETE/RELOAD cycle:
>>> 
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
>>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>> curl -X DELETE
>>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>> curl
>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
>>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>> 
>>> Reloading after creating the Stopwords list but not after deleting it
>> works
>>> without error too on a fresh techproducts core (you'll have to remove the
>>> directory from disk and create the core again after running the previous
>>> commands).
>>> 
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> curl
>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>> curl -X DELETE
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> 
>>> And even curiouser, when doing a CREATE/DELETE for Stopwords, then a
>>> CREATE/DELETE for Synonyms, and only then a RELOAD of the core, the cycle
>>> can be completed twice. (Again, on a freshly created techproducts core.)
>>> Only the third attempt to create a list results in an error. Synonyms can
>>> still be created and deleted repeatedly after this.
>>> 
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> curl -X DELETE
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
>>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>> curl -X DELETE
>>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>> curl
>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> curl -X DELETE
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> 
>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
>>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>> curl -X DELETE
>>> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
>>> curl
>> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
>>> curl -X PUT -H 'Content-type:application/json' --data-binary
>>> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
>>> 
>> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
>>> 
>>> The same successes/errors occur when running each cycle against a
>> different
>>> core if the cores share the same configset.
>>> 
>>> Any ideas on what might be going wrong?
>> 
>> 


Re: Re-creating deleted Managed Stopwords lists results in error

Posted by David Hastings <ha...@gmail.com>.
i use stop words for building shingles into "interesting phrases" for my
machine teacher/students, so i wouldnt say theres no reason, however my use
case is very specific.  Otherwise yeah, theyre gone for all practical
reasons/search scenarios.

On Mon, Feb 17, 2020 at 1:41 PM Walter Underwood <wu...@wunderwood.org>
wrote:

> Why are you using stopwords? I would need a really, really good reason to
> use those.
>
> Stopwords are an obsolete technique from 16-bit processors. I’ve never
> used them and
> I’ve been a search engineer since 1997.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Feb 17, 2020, at 7:31 AM, Thomas Corthals <th...@klascement.net>
> wrote:
> >
> > Hi
> >
> > I've run into an issue with creating a Managed Stopwords list that has
> the
> > same name as a previously deleted list. Going through the same flow with
> > Managed Synonyms doesn't result in this unexpected behaviour. Am I
> missing
> > something or did I discover a bug in Solr?
> >
> > On a newly started solr with the techproducts core:
> >
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> > '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> > curl -X DELETE
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> > curl
> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> > '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >
> > The second PUT request results in a status 500 with error
> > msg "java.util.LinkedHashMap cannot be cast to java.util.List".
> >
> > Similar requests for synonyms work fine, no matter how many times I
> repeat
> > the CREATE/DELETE/RELOAD cycle:
> >
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> >
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> > http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> > curl -X DELETE
> > http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> > curl
> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> >
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> > http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> >
> > Reloading after creating the Stopwords list but not after deleting it
> works
> > without error too on a fresh techproducts core (you'll have to remove the
> > directory from disk and create the core again after running the previous
> > commands).
> >
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> > '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> > curl
> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> > curl -X DELETE
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> > '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >
> > And even curiouser, when doing a CREATE/DELETE for Stopwords, then a
> > CREATE/DELETE for Synonyms, and only then a RELOAD of the core, the cycle
> > can be completed twice. (Again, on a freshly created techproducts core.)
> > Only the third attempt to create a list results in an error. Synonyms can
> > still be created and deleted repeatedly after this.
> >
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> > '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> > curl -X DELETE
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> >
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> > http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> > curl -X DELETE
> > http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> > curl
> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> > '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> > curl -X DELETE
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> >
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> > http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> > curl -X DELETE
> > http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> > curl
> http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> > curl -X PUT -H 'Content-type:application/json' --data-binary
> > '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> >
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> >
> > The same successes/errors occur when running each cycle against a
> different
> > core if the cores share the same configset.
> >
> > Any ideas on what might be going wrong?
>
>

Re: Re-creating deleted Managed Stopwords lists results in error

Posted by Walter Underwood <wu...@wunderwood.org>.
Why are you using stopwords? I would need a really, really good reason to use those.

Stopwords are an obsolete technique from 16-bit processors. I’ve never used them and
I’ve been a search engineer since 1997.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 17, 2020, at 7:31 AM, Thomas Corthals <th...@klascement.net> wrote:
> 
> Hi
> 
> I've run into an issue with creating a Managed Stopwords list that has the
> same name as a previously deleted list. Going through the same flow with
> Managed Synonyms doesn't result in this unexpected behaviour. Am I missing
> something or did I discover a bug in Solr?
> 
> On a newly started solr with the techproducts core:
> 
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> curl -X DELETE
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> 
> The second PUT request results in a status 500 with error
> msg "java.util.LinkedHashMap cannot be cast to java.util.List".
> 
> Similar requests for synonyms work fine, no matter how many times I repeat
> the CREATE/DELETE/RELOAD cycle:
> 
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> curl -X DELETE
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> 
> Reloading after creating the Stopwords list but not after deleting it works
> without error too on a fresh techproducts core (you'll have to remove the
> directory from disk and create the core again after running the previous
> commands).
> 
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> curl -X DELETE
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> 
> And even curiouser, when doing a CREATE/DELETE for Stopwords, then a
> CREATE/DELETE for Synonyms, and only then a RELOAD of the core, the cycle
> can be completed twice. (Again, on a freshly created techproducts core.)
> Only the third attempt to create a list results in an error. Synonyms can
> still be created and deleted repeatedly after this.
> 
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> curl -X DELETE
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> curl -X DELETE
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> curl -X DELETE
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedSynonymGraphFilterFactory$SynonymManager"}'
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> curl -X DELETE
> http://localhost:8983/solr/techproducts/schema/analysis/synonyms/testmap
> curl http://localhost:8983/solr/admin/cores?action=RELOAD\&core=techproducts
> curl -X PUT -H 'Content-type:application/json' --data-binary
> '{"class":"org.apache.solr.rest.schema.analysis.ManagedWordSetResource"}'
> http://localhost:8983/solr/techproducts/schema/analysis/stopwords/testlist
> 
> The same successes/errors occur when running each cycle against a different
> core if the cores share the same configset.
> 
> Any ideas on what might be going wrong?