Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2015/11/27 09:52:11 UTC

[jira] [Commented] (OAK-3674) Search Excerpt highlighting is not correct

    [ https://issues.apache.org/jira/browse/OAK-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029631#comment-15029631 ] 

Thomas Mueller commented on OAK-3674:
-------------------------------------

This is a limitation of the current SimpleExcerptProvider: it ignores word boundaries.
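
To illustrate what ignoring word boundaries means, here is a minimal sketch. It is not the actual SimpleExcerptProvider code: the class and method names are made up for this example, and the regex-based matching is only an assumption chosen to contrast plain substring matching with whole-word matching.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Oak.
public class ExcerptHighlightSketch {

    // Substring matching: wraps every occurrence of the token,
    // including occurrences inside longer words.
    public static String highlightSubstring(String text, String token) {
        Pattern p = Pattern.compile(Pattern.quote(token), Pattern.CASE_INSENSITIVE);
        return p.matcher(text).replaceAll("<strong>$0</strong>");
    }

    // Whole-word matching: \b requires a word boundary on both sides,
    // so the token is not matched inside a longer word.
    public static String highlightWholeWord(String text, String token) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(token) + "\\b",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(text).replaceAll("<strong>$0</strong>");
    }

    public static void main(String[] args) {
        String text = "Conflict of Interest Code must identify officials";
        System.out.println(highlightSubstring(text, "of"));
        System.out.println(highlightWholeWord(text, "of"));
    }
}
```

With substring matching, the "of" at the start of "officials" is wrapped as well, which is the effect reported in this issue. With whole-word matching, only the standalone "of" is wrapped.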

Could you try quoting the search tokens? That is, use double quotes inside the single quotes:

{noformat}
//*[jcr:contains(., '"conflict of interest"')]/rep:excerpt(.)
{noformat}

Ultimately, I don't think it makes sense to try to improve SimpleExcerptProvider. Instead, we should probably use the Lucene highlighter, which is available in OAK-3580. So I will close this issue as a duplicate of OAK-3580.


> Search Excerpt highlighting is not correct
> ------------------------------------------
>
>                 Key: OAK-3674
>                 URL: https://issues.apache.org/jira/browse/OAK-3674
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: query
>    Affects Versions: 1.0.18, 1.0.23
>            Reporter: Srijan Bhatnagar
>
> We have the following text on a jcr:content node :
> {code}A state agency’s Conflict of Interest Code must reflect the current structure of the organization and properly identify officials andemployees{code} 
> On executing the following query :
> {code}//*[jcr:contains(., 'conflict of interest')]/rep:excerpt(.){code}
> we get a row whose excerpt value has the <strong></strong> tags in the wrong place.
> Observed result:
> {code}<div><span>&lt;p&gt;A state agency’s Conflict <strong>of</strong> Interest Code must reflect the current structure <strong>of</strong> the organization and properly identify <strong>of</strong>ficials andemployees&lt;/p&gt;</span></div>{code}
> I don't think {quote}<strong>of</strong>ficials{quote} is expected in the excerpt.
> We get the excerpt value in the following manner :
> org.apache.jackrabbit.oak.jcr.query.RowImpl#getValue("rep:excerpt(" + nodePath + ")")



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)