Posted to solr-user@lucene.apache.org by Roberto Nieto <r....@almatech.es> on 2008/07/08 14:07:24 UTC

problems with SpellCheckComponent

Hi,

I have downloaded the trunk version today and I'm having problems with the
SpellCheckComponent. Is this a known bug?

This is my configuration:
#############################################################
<searchComponent name="spellcheck"
  class="org.apache.solr.handler.component.SpellCheckComponent">
  <lst name="defaults">
   <!-- omp = Only More Popular -->
   <str name="spellcheck.onlyMorePopular">false</str>
   <!-- exr = Extended Results -->
   <str name="spellcheck.extendedResults">false</str>
   <!--  The number of suggestions to return -->
   <str name="spellcheck.count">1</str>
  </lst>
  <str name="queryAnalyzerFieldType">text</str>

  <lst name="spellchecker">
   <str name="name">default</str>
   <str name="field">title</str>
   <str name="spellcheckIndexDir">spellchecker_defaultXX</str>

  </lst>
 </searchComponent>

 <queryConverter name="queryConverter"
  class="org.apache.solr.spelling.SpellingQueryConverter" />

 <requestHandler name="/spellCheckCompRH"
  class="org.apache.solr.handler.component.SearchHandler">
  <arr name="last-components">
   <str>spellcheck</str>
  </arr>
 </requestHandler>
##########################################################

SCHEMA.XML:... <field name="*title*" type="*text*" indexed="*true*" stored="*true*" /> ...

When I request:
http://localhost:8080/solr/spellCheckCompRH?q=*:*&spellcheck.q=ruck&spellcheck=true

I get this exception:

Estado HTTP 500 - null
java.lang.NullPointerException
    at org.apache.solr.handler.component.SpellCheckComponent.getTokens(SpellCheckComponent.java:217)
    at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:184)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:156)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:128)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1025)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:852)
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:584)
    at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1508)
    at java.lang.Thread.run(Unknown Source)

Any help would be very useful. Thanks for your attention.

Rober

Re: problems with SpellCheckComponent

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
Hi Roberto,

1. Why do you have those asterisk characters in your schema field
definition?
2. Did you do a spellcheck.build=true before issuing the first spell check
request?

Also, as per the latest docs on the wiki (
http://wiki.apache.org/solr/SpellCheckComponent ), the defaults section
should be moved to the /spellCheckCompRH section.
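
A minimal sketch of that change, assuming the rest of the posted
configuration stays as it is: the defaults list is lifted out of the
searchComponent and placed inside the /spellCheckCompRH request handler. The
parameter values are the ones from the original post, and nothing here has
been tested against the trunk build in question.

```xml
<!-- The component keeps only its own settings (analyzer type and spellchecker). -->
<searchComponent name="spellcheck"
  class="org.apache.solr.handler.component.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">title</str>
    <str name="spellcheckIndexDir">spellchecker_defaultXX</str>
  </lst>
</searchComponent>

<!-- Request-time defaults now sit on the handler that runs the component. -->
<requestHandler name="/spellCheckCompRH"
  class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```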

On Tue, Jul 8, 2008 at 5:37 PM, Roberto Nieto <r....@almatech.es> wrote:

> I have downloaded the trunk version today and I'm having problems with the
> SpellCheckComponent. Is this a known bug?



-- 
Regards,
Shalin Shekhar Mangar.

Re: problems with SpellCheckComponent

Posted by Norberto Meijome <fr...@meijome.net>.
On Tue, 8 Jul 2008 21:10:51 +0530
"Shalin Shekhar Mangar" <sh...@gmail.com> wrote:

> Also note that you'll need to specify spellcheck.build=true only on the
> first request when it will build the spell check index. The subsequent
> requests need not have spellcheck.build=true.

As a matter of fact, you won't want to have spellcheck.build=true in every
request, as it will hurt your server's performance. The impact may be minimal
if Solr can compare the timestamps of the spellchecker index and the main
index and skip the rebuild; I don't know if this is how it is implemented.

You really only want to rebuild after a commit.
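
One way to tie the rebuild to commits, sketched under an assumption: later
Solr releases document a buildOnCommit setting inside the spellchecker list.
This thread does not confirm that the setting exists in the trunk build being
discussed, so treat the fragment as an illustration rather than something to
paste in as is. The other values are copied from the configuration in the
original post.

```xml
<lst name="spellchecker">
  <str name="name">default</str>
  <str name="field">title</str>
  <str name="spellcheckIndexDir">spellchecker_defaultXX</str>
  <!-- Assumption: buildOnCommit comes from later Solr documentation, not from
       this thread; it asks Solr to rebuild the spell index after each commit. -->
  <str name="buildOnCommit">true</str>
</lst>
```

With a setting like this in place, requests would not need to carry
spellcheck.build=true at all.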

B

_________________________
{Beto|Norberto|Numard} Meijome

Re: problems with SpellCheckComponent

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
The spellcheck.q parameter is optional. However, the q parameter is
compulsory. So you can write q=macrosoft and avoid spellcheck.q altogether.
The difference is that spellcheck.q is used for spell checking if present and
is analyzed with the query analyzer of the Solr field used to build the
index, whereas the q parameter is analyzed with a WhitespaceAnalyzer.

Also note that you'll need to specify spellcheck.build=true only on the
first request when it will build the spell check index. The subsequent
requests need not have spellcheck.build=true.

On Tue, Jul 8, 2008 at 9:03 PM, <r....@almatech.es> wrote:

> I don't understand the part where Geoff says "I've never seen it happen
> with just the q or just the spellcheck.q fields in my query".
>
> Does that mean that I can send, for example:
> http://192.168.92.5:8080/solr/spellCheckCompRH?spellcheck.q=macrosoft&spellcheck=true&spellcheck.build=true


-- 
Regards,
Shalin Shekhar Mangar.

RE: problems with SpellCheckComponent

Posted by r....@almatech.es.
Hi,

Thanks for your help.

I don't understand the part where Geoff says "I've never seen it happen
with just the q or just the spellcheck.q fields in my query".

Does that mean that I can send, for example:
	http://192.168.92.5:8080/solr/spellCheckCompRH?spellcheck.q=macrosoft&spellcheck=true&spellcheck.build=true

I usually send requests like:
	http://192.168.92.5:8080/solr/spellCheckCompRH?q=a&spellcheck.q=macrosoft&spellcheck=true&spellcheck.build=true

I don't know if I am understanding this correctly, but if I try the first
request I get this exception:

Estado HTTP 500 - null
java.lang.NullPointerException
    at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36)
    at org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104)
    at org.apache.solr.search.QParser.getQuery(QParser.java:87)
    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:82)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:135)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:128)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1025)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:852)
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:584)
    at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1508)
    at java.lang.Thread.run(Unknown Source)


I hope this helps.
If you need me to run any tests, don't hesitate to ask.

Rober.

-----Original Message-----
From: Geoffrey Young [mailto:geoff@modperlcookbook.org]
Sent: Tuesday, July 8, 2008 16:58
To: solr-user@lucene.apache.org
Subject: Re: problems with SpellCheckComponent



Shalin Shekhar Mangar wrote:
> Hi Geoff,
>
> I can't find anything in the code which would give this exception when both
> q and spellcheck.q are specified. Though, this exception is certainly
> possible when you restart solr. Anyway, I'll look into it more deeply.

great, thanks.

>
> There are a few ways in which we can improve this component. For example a
> lot of this trouble can go away if we can reload the spell index on startup
> if it exists or build it if it does not exist (SOLR-593 would need to be
> resolved for this). With SOLR-605 committed, we can now add an option to
> re-build the index (built from Solr fields) on commits by adding a listener
> using the API. There are a few issues with collation which are being handled
> in SOLR-606.
>
> I'll open new issues to track these items. Please bear with us since this is
> a new component and may take a few iterations to stabilize. Thank you for
> helping us find these issues :)

np - this is a great feature to have and it's going to save me some 
effort as we prepare for deployment, so it's worth taking the time to 
work out the bugs.

thanks for your effort.

--Geoff


Re: problems with SpellCheckComponent

Posted by Geoffrey Young <ge...@modperlcookbook.org>.

Shalin Shekhar Mangar wrote:
> Hi Geoff,
> 
> I can't find anything in the code which would give this exception when both
> q and spellcheck.q are specified. Though, this exception is certainly
> possible when you restart solr. Anyway, I'll look into it more deeply.

great, thanks.

> 
> There are a few ways in which we can improve this component. For example a
> lot of this trouble can go away if we can reload the spell index on startup
> if it exists or build it if it does not exist (SOLR-593 would need to be
> resolved for this). With SOLR-605 committed, we can now add an option to
> re-build the index (built from Solr fields) on commits by adding a listener
> using the API. There are a few issues with collation which are being handled
> in SOLR-606.
> 
> I'll open new issues to track these items. Please bear with us since this is
> a new component and may take a few iterations to stabilize. Thank you for
> helping us find these issues :)

np - this is a great feature to have and it's going to save me some 
effort as we prepare for deployment, so it's worth taking the time to 
work out the bugs.

thanks for your effort.

--Geoff

Re: problems with SpellCheckComponent

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
Hi Geoff,

I can't find anything in the code which would give this exception when both
q and spellcheck.q are specified. Though, this exception is certainly
possible when you restart solr. Anyway, I'll look into it more deeply.

There are a few ways in which we can improve this component. For example a
lot of this trouble can go away if we can reload the spell index on startup
if it exists or build it if it does not exist (SOLR-593 would need to be
resolved for this). With SOLR-605 committed, we can now add an option to
re-build the index (built from Solr fields) on commits by adding a listener
using the API. There are a few issues with collation which are being handled
in SOLR-606.

I'll open new issues to track these items. Please bear with us since this is
a new component and may take a few iterations to stabilize. Thank you for
helping us find these issues :)

On Tue, Jul 8, 2008 at 6:32 PM, Geoffrey Young <ge...@modperlcookbook.org> wrote:

>> When I made:
>>
>> http://localhost:8080/solr/spellCheckCompRH?q=*:*&spellcheck.q=ruck&spellcheck=true
>>
>> I have this exception:
>>
>> Estado HTTP 500 - null java.lang.NullPointerException at
>> org.apache.solr.handler.component.SpellCheckComponent.getTokens(SpellCheckComponent.java:217)
>
> I see this all the time - to the point where I wonder how stable the new
> component is.
>
> I *think* I've traced it to
>
>  o the presence of both q *and* spellcheck.q
>  o and *any* restart of solr without re-issuing spellcheck.build=true
>
> I haven't been using any form of spellchecker for long, but I'm reasonably
> sure that I didn't need to rebuild on every restart.  I also used to think
> it was changes to schema.xml (and not a simple restart) that caused the
> issue, but I've seen the exception with no changes. I've also seen the
> exception pop up without a restart when the server sits overnight (last
> query of the day ok, go to sleep, query again in the morning and *boom*)
>
> but regardless of restart issues, I've never seen it happen with just the q
> or just the spellcheck.q fields in my query - it's always when they're both
> there.
>
> --Geoff
>



-- 
Regards,
Shalin Shekhar Mangar.

Re: problems with SpellCheckComponent

Posted by Geoffrey Young <ge...@modperlcookbook.org>.
> When I made:
> http://localhost:8080/solr/spellCheckCompRH?q=*:*&spellcheck.q=ruck&spellcheck=true
> 
> I have this exception:
> 
> Estado HTTP 500 - null java.lang.NullPointerException at
> org.apache.solr.handler.component.SpellCheckComponent.getTokens(SpellCheckComponent.java:217)


I see this all the time - to the point where I wonder how stable the new 
component is.

I *think* I've traced it to

   o the presence of both q *and* spellcheck.q
   o and *any* restart of solr without re-issuing spellcheck.build=true

I haven't been using any form of spellchecker for long, but I'm 
reasonably sure that I didn't need to rebuild on every restart.  I also 
used to think it was changes to schema.xml (and not a simple restart) 
that caused the issue, but I've seen the exception with no changes. 
I've also seen the exception pop up without a restart when the server 
sits overnight (last query of the day ok, go to sleep, query again in 
the morning and *boom*)

but regardless of restart issues, I've never seen it happen with just 
the q or just the spellcheck.q fields in my query - it's always when 
they're both there.

--Geoff