Posted to solr-user@lucene.apache.org by andrew <an...@digicol.de> on 2012/03/01 21:14:22 UTC
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
I have the same problem. This happens only for some documents in the index.
As sharadgaur reported, the problem went away when I removed
ReversedWildcardFilterFactory from my analysis chain.
HTMLStripCharFilterFactory was in the chain both before and after.
I am running branch-3.6 r1238628. As far as I can tell, this already has the
fixes from LUCENE-2208 / LUCENE-3690.
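For illustration only, an index-time analysis chain of the kind described might look roughly like this in schema.xml. The field type name, the tokenizer, the lowercase filter and all attribute values are assumptions; the actual chain is not shown in this thread.

```xml
<!-- Hypothetical field type: name, tokenizer and attribute values are assumptions -->
<fieldType name="text_example" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Present both before and after the change described above -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Removing this filter is what made the exception disappear -->
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Any change to the index-time analyzer only takes effect for documents that are reindexed afterwards.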
--
View this message in context: http://lucene.472066.n3.nabble.com/search-highlight-InvalidTokenOffsetsException-in-Solr-3-5-tp3560997p3791598.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by Ahmet Arslan <io...@yahoo.com>.
> But - the wiki page has a foot note that says "a tokenizer
> must be defined
> for the field, but it doesn't need to be indexed". The body
> field has the
> type "dcx_text" which has a tokenizer.
>
> Is the documentation wrong here or am I misunderstanding
> something?
Ah, I never read that note (I was just looking at the table).
I think you are right: I can generate a snippet from the following field:
<field name="body" type="dcx_text" stored="true" indexed="false" multiValued="true"/>
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by andrew <an...@digicol.de>.
Ah, ok - thank you for looking at it.
But - the wiki page has a foot note that says "a tokenizer must be defined
for the field, but it doesn't need to be indexed". The body field has the
type "dcx_text" which has a tokenizer.
Is the documentation wrong here or am I misunderstanding something?
--
View this message in context: http://lucene.472066.n3.nabble.com/search-highlight-InvalidTokenOffsetsException-in-Solr-3-5-tp3560997p3793706.html
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by Ahmet Arslan <io...@yahoo.com>.
> Ahmet, this is a good find. Can we still open a JIRA issue
> so that a
> more useful exception is thrown here?
Robert, I created SOLR-3193 and created a test using Andrew's files.
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by Robert Muir <rc...@gmail.com>.
On Fri, Mar 2, 2012 at 9:41 AM, Ahmet Arslan <io...@yahoo.com> wrote:
>
>> Robert, I just tried with
>> 3.6-SNAPSHOT 1296203 from svn - the problem is
>> still there.
>>
>> I am just about to leave for a vacation. I'll try to open a
>> JIRA issue this
>> evening.
>
> Andrew, thanks for providing files. I also re-produced it.
>
> But cause of the exception is that you are trying to highlight on a field (body) that is not indexed.
>
> To enable highlighting you need both indexed="true" and stored="true" .
> http://wiki.apache.org/solr/FieldOptionsByUseCase
>
> I changed definition of body field from indexed="false" to indexed="true" and it is working now.
>
> But for the record (with indexed="false"), it is weird that it produces snippet in the first request, and then fails in the second request.
>
>
Ahmet, this is a good find. Can we still open a JIRA issue so that a
more useful exception is thrown here?
--
lucidimagination.com
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by Ahmet Arslan <io...@yahoo.com>.
> Robert, I just tried with
> 3.6-SNAPSHOT 1296203 from svn - the problem is
> still there.
>
> I am just about to leave for a vacation. I'll try to open a
> JIRA issue this
> evening.
Andrew, thanks for providing the files. I also reproduced it.
But the cause of the exception is that you are trying to highlight on a field (body) that is not indexed.
To enable highlighting you need both indexed="true" and stored="true".
http://wiki.apache.org/solr/FieldOptionsByUseCase
I changed the definition of the body field from indexed="false" to indexed="true" and it is working now.
But for the record (with indexed="false"), it is weird that it produces a snippet on the first request and then fails on the second request.
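The change described above can be sketched as the following schema.xml fragment. The name and type of the field come from this thread; the multiValued attribute is taken from a field definition quoted elsewhere in the thread and is an assumption about the original schema.

```xml
<!-- Before (reported to trigger InvalidTokenOffsetsException on every second request):
     <field name="body" type="dcx_text" stored="true" indexed="false" multiValued="true"/>
-->

<!-- After: the only difference is indexed="true" -->
<field name="body" type="dcx_text" stored="true" indexed="true" multiValued="true"/>
```

Existing documents have to be reindexed after switching a field to indexed="true", since the change does not apply retroactively to content already in the index.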
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by andrew <an...@digicol.de>.
Robert, I just tried with 3.6-SNAPSHOT 1296203 from svn - the problem is
still there.
I am just about to leave for a vacation. I'll try to open a JIRA issue this
evening.
--
View this message in context: http://lucene.472066.n3.nabble.com/search-highlight-InvalidTokenOffsetsException-in-Solr-3-5-tp3560997p3793593.html
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by andrew <an...@digicol.de>.
I posted the files here: http://www.mediafire.com/?z43a5qyfvz4zxp1
--
View this message in context: http://lucene.472066.n3.nabble.com/search-highlight-InvalidTokenOffsetsException-in-Solr-3-5-tp3560997p3793496.html
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by Ahmet Arslan <io...@yahoo.com>.
> I think it is not a good idea to post the Solr <add/>
> XML here - it is very
> long (text extract of a newspaper page) and may not
> reproduce verbatim
> (whitespace etc.) if I paste it here.
>
> iorixxx, koji - is it ok if I send the necessary artifacts
> (add XML, schema,
> config) via email?
I have seen people use http://pastebin.com/ for this purpose before. Can you provide your full search URL too?
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by Robert Muir <rc...@gmail.com>.
On Fri, Mar 2, 2012 at 7:37 AM, andrew <an...@digicol.de> wrote:
> I was able to create a test case.
>
> We are querying ranges of documents. When I tried to isolate the document
> that causes trouble, I found it happens with exactly every second request
> only for a single document query (it fails constantly when requesting a
> range of documents where that document is included). I could also reproduce
> the exception with only that single document in the index.
>
> I think it is not a good idea to post the Solr <add/> XML here - it is very
> long (text extract of a newspaper page) and may not reproduce verbatim
> (whitespace etc.) if I paste it here.
>
> iorixxx, koji - is it ok if I send the necessary artifacts (add XML, schema,
> config) via email?
>
You can also open a jira issue
(https://issues.apache.org/jira/browse/SOLR), and upload everything as
attachments.
I would also be very interested if you can test a nightly 3.6 build
(https://builds.apache.org/job/Solr-3.x/lastSuccessfulBuild/artifact/artifacts/)
There have been *numerous* offsets bugs fixed in 3.6 in a variety of
tokenizers/tokenfilters besides the HTMLStripCharFilter:
https://issues.apache.org/jira/browse/LUCENE-3642
https://issues.apache.org/jira/browse/SOLR-2891
https://issues.apache.org/jira/browse/LUCENE-3717
--
lucidimagination.com
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by andrew <an...@digicol.de>.
I was able to create a test case.
We are querying ranges of documents. When I tried to isolate the document
that causes trouble, I found that a query for that single document fails on
exactly every second request (whereas a query for a range of documents that
includes it fails every time). I could also reproduce the exception with only
that single document in the index.
I think it is not a good idea to post the Solr <add/> XML here - it is very
long (text extract of a newspaper page) and may not reproduce verbatim
(whitespace etc.) if I paste it here.
iorixxx, koji - is it ok if I send the necessary artifacts (add XML, schema,
config) via email?
--
View this message in context: http://lucene.472066.n3.nabble.com/search-highlight-InvalidTokenOffsetsException-in-Solr-3-5-tp3560997p3793347.html
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
(12/03/02 6:05), Ahmet Arslan wrote:
>> I have the same problem. This happens
>> only for some documents in the index.
>
> Andrew, can you provide a document string and a query pair? I will try to re-produce the exception. Then we can create a test case that fails. Others can look into it.
+1. Please do it!
koji
--
Query Log Visualizer for Apache Solr
http://soleami.com/
Re: search.highlight.InvalidTokenOffsetsException in Solr 3.5
Posted by Ahmet Arslan <io...@yahoo.com>.
> I have the same problem. This happens
> only for some documents in the index.
Andrew, can you provide a document string and query pair? I will try to reproduce the exception. Then we can create a failing test case that others can look into.