Posted to dev@lucene.apache.org by "Christian Kesselheim (JIRA)" <ji...@apache.org> on 2010/09/29 10:45:36 UTC

[jira] Issue Comment Edited: (SOLR-1883) Highlighting failure caused by InvalidTokenOffsetsException

    [ https://issues.apache.org/jira/browse/SOLR-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916078#action_12916078 ] 

Christian Kesselheim edited comment on SOLR-1883 at 9/29/10 4:44 AM:
---------------------------------------------------------------------

I'm experiencing the same problem when using the solr.PatternReplaceCharFilterFactory char filter on today's nightly build of Solr 1.5.

      was (Author: ckesselh):
    I'm experiencing the same problem when using the solr.PatternReplaceCharFilterFactory filter on the current nightly build of SOLR 1.5.
  
> Highlighting failure caused by InvalidTokenOffsetsException
> -----------------------------------------------------------
>
>                 Key: SOLR-1883
>                 URL: https://issues.apache.org/jira/browse/SOLR-1883
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter
>    Affects Versions: 1.4
>         Environment: {code:title=java}
> Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
> Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
> {code}
> {code:title=solr lib manifest}
> Manifest-Version: 1.0
> Ant-Version: Apache Ant 1.7.0
> Created-By: 14.1-b02-90 (Apple Inc.)
> Extension-Name: org.apache.solr
> Specification-Title: Apache Solr Search Server
> Specification-Version: 1.4.0
> Specification-Vendor: The Apache Software Foundation
> Implementation-Title: org.apache.solr
> Implementation-Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:
>  33:40
> Implementation-Vendor: The Apache Software Foundation
> X-Compile-Source-JDK: 1.5
> X-Compile-Target-JDK: 1.5
> {code}
> {code:title=OS}
> Linux myhost 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
> {code}
>            Reporter: Luke Forehand
>         Attachments: schema.xml, test_doc_for_invalid_token_offsets_exception.xml
>
>
> This issue seems to be the same as an earlier issue that was bulk closed in Solr 1.4 (https://issues.apache.org/jira/browse/SOLR-1404), and someone has reported the same bug against Lucene 2.9.1 (https://issues.apache.org/jira/browse/LUCENE-2208). We are experiencing this issue as well.
> I have pasted the relevant part of our schema.xml and the Solr exception below, and attached the document that fails when it is queried with highlighting enabled. The invalid token seems to be 'system', which is the very last token in the document field in the attached file.
> {code:title=schema.xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <schema name="xxx" version="1.1">
> 	<types>
> 		<fieldType name="scrubbedText" class="solr.TextField" positionIncrementGap="100">
> 			<analyzer>
> 				<tokenizer class="solr.StandardTokenizerFactory" />
> 				<charFilter class="solr.HTMLStripCharFilterFactory" />
> 				<filter class="solr.StandardFilterFactory" />
> 				<filter class="solr.LowerCaseFilterFactory" />
> 				<filter class="solr.StopFilterFactory" />
> 			</analyzer>
> 		</fieldType>
> 		...
> 	</types>
> 	<fields>
> 		<field name="id" type="string" stored="true" indexed="true" />
> 		<field name="textScrubbed" type="scrubbedText" stored="true" indexed="true" />
> 		...
> 	</fields>
> 	<uniqueKey>id</uniqueKey>
> 	<defaultSearchField>textScrubbed</defaultSearchField>
> </schema>
> {code}
> {code:title=solr.log exception}
> Apr 13, 2010 3:08:35 AM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token system exceeds length of provided text sized 17063
>         at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:342)
>         at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89)
>         at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>         at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:859)
>         at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:574)
>         at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1527)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token system exceeds length of provided text sized 17063
>         at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:254)
>         at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:335)
>         ... 18 more
> {code}

-- 
This message is automatically generated by JIRA.