Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2014/02/21 06:28:20 UTC
[jira] [Commented] (SOLR-5759) increasing hl.fragsize loses part of the search term

    [ https://issues.apache.org/jira/browse/SOLR-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907976#comment-13907976 ] 

Hoss Man commented on SOLR-5759:
--------------------------------

bq. Let me know if there is any other information I can supply, or checks I can perform.

eric: the most useful thing you could provide to help us understand this problem is a set of self-contained, reproducible steps -- ie: steps that include all the necessary configs & data.

If you're comfortable writing a JUnit test, that's by far the most helpful. Almost as useful would be an example of a single document that can be indexed against the example schema.xml, followed by a sample query that shows the problem.  (Or, if you are unable to reproduce the problem with the example configs, please attach a full set of configs that can be used to reproduce it.)
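
As a rough illustration of the kind of check such a test could end with, here is a minimal plain-Java sketch. It does not talk to Solr at all: the two fragment strings are copied from the issue description quoted below, and the class and method names (FragmentCheck, allTermsHighlighted) are hypothetical, made up for this example. A real reproduction would still need to index a document and run the two queries to obtain the fragments.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr: checks whether every search term
// appears wrapped in the highlight markers somewhere in a returned fragment.
public class FragmentCheck {

    public static boolean allTermsHighlighted(String fragment, List<String> terms,
                                              String pre, String post) {
        for (String term : terms) {
            // A term counts as highlighted only if it is enclosed in pre/post.
            if (!fragment.contains(pre + term + post)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("Tony", "Yet");

        // Fragment reported in the issue for hl.fragsize=100.
        String frag100 = " enterprise forward.\n\n<em>Tony</em> <em>Yet</em>, one of the"
                + " centre's organisers, explains: \"I think what Hong Kong needs";

        // Fragment reported in the issue for hl.fragsize=200.
        String frag200 = " interested in social issues, as well as mentorship for upcoming"
                + " enterprises.\n\nAs in the UK, it is also creating the community of"
                + " people, skills and ideas that is needed to push social enterprise"
                + " forward.\n\n<em>Tony</em>";

        System.out.println("fragsize=100 complete: "
                + allTermsHighlighted(frag100, terms, "<em>", "</em>"));
        System.out.println("fragsize=200 complete: "
                + allTermsHighlighted(frag200, terms, "<em>", "</em>"));
    }
}
```

With the two fragments from the report, the first check passes and the second fails, which is the behaviour a reproducing test would want to assert against.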



> increasing hl.fragsize loses part of the search term
> ----------------------------------------------------
>
>                 Key: SOLR-5759
>                 URL: https://issues.apache.org/jira/browse/SOLR-5759
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter
>    Affects Versions: 4.4
>         Environment: Ubuntu 12.04
>            Reporter: eric casteleijn
>
> When using the highlighter, and increasing the fragsize from 100 (the default) to 200, sometimes the search term is no longer entirely contained by the returned fragment, even though it was in the smaller snippet.
> For instance:
> http://host/solr/index/select?q=("Tony+Yet"+AND+exact_text:"Tony+Yet")&wt=json&indent=true&hl=true&hl.fl=title,summary,extracted_text&hl.simple.pre=<em>&hl.simple.post=</em>&hl.fragsize=100
> results in the fragment:
> "7618861":{
>       "extracted_text":[" enterprise forward.\n\n<em>Tony</em> <em>Yet</em>, one of the centre's organisers, explains: \"I think what Hong Kong needs"]},
> whereas:
> http://host/solr/index/select?q=("Tony+Yet"+AND+exact_text:"Tony+Yet")&wt=json&indent=true&hl=true&hl.fl=title,summary,extracted_text&hl.simple.pre=<em>&hl.simple.post=</em>&hl.fragsize=200
> results in:
> "7618861":{
>       "extracted_text":[" interested in social issues, as well as mentorship for upcoming enterprises.\n\nAs in the UK, it is also creating the community of people, skills and ideas that is needed to push social enterprise forward.\n\n<em>Tony</em>"]},
> Both reference roughly the same position from the same field, but I can't for the life of me imagine why the larger fragment would shift to the left so far as to drop half of the search term.
> If desirable, I can upload the entire json results for both requests.
> Let me know if there is any other information I can supply, or checks I can perform.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
