Posted to issues@lucy.apache.org by "Nick Wellnhofer (JIRA)" <ji...@apache.org> on 2013/10/14 01:16:42 UTC
[lucy-issues] [jira] [Resolved] (LUCY-199) Highlighting/excerpt on URLs
[ https://issues.apache.org/jira/browse/LUCY-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Wellnhofer resolved LUCY-199.
----------------------------------
Resolution: Fixed
Fix Version/s: 0.4.0
After rewriting the Highlighter code to use string iterators, the test case produces:
{noformat}
$ ./search statue
… ckens-gets-a-<strong>statue</strong>-1.1130220
$ ./search dickens
… oks/what-the-<strong>dickens</strong>-gets-a-statue-1.
{noformat}
This could be improved by breaking at non-alphabetic characters, but I think it's good enough to resolve this issue.
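As a rough illustration of what breaking at non-alphabetic characters could look like, here is a minimal pure-Perl sketch. The helper name snap_start_to_boundary is hypothetical and is not part of Lucy or of the actual Highlighter code; it only shows the idea of moving an excerpt's start offset forward so that the excerpt does not begin in the middle of a word.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lucy): if $start falls inside a run of
# letters, move it forward to just past the next non-alphabetic character.
sub snap_start_to_boundary {
    my ( $text, $start ) = @_;
    return $start if $start <= 0 or $start >= length $text;

    # Already at a boundary if the preceding character is not a letter.
    return $start if substr( $text, $start - 1, 1 ) !~ /\p{L}/;

    # Skip the remainder of the partial word plus one separator character.
    pos($text) = $start;
    return pos($text) if $text =~ /\G\p{L}*\P{L}/g;

    # No separator found after $start: leave the offset unchanged.
    return $start;
}

my $url   = 'www.iol.co.za/tonight/books/what-the-dickens-gets-a-statue-1.1130220';
my $start = index( $url, 'ckens' );    # excerpt start taken from the output above
my $snapped = snap_start_to_boundary( $url, $start );
print substr( $url, $snapped ), "\n";  # prints "gets-a-statue-1.1130220"
```

With this adjustment the first excerpt would start at "gets" rather than at "ckens". A real implementation would also have to adjust the end of the excerpt and keep the highlighted term inside the window.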
(Note that the test case file 'runme' has to be changed to use a FullTextType with 'highlightable => 1'.)
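For reference, a highlightable field could be declared roughly as follows. This is only a sketch: the contents of 'runme' are not shown here, so the analyzer choice and variable names are assumptions; the field name 'site' is taken from the issue description below.

```perl
use strict;
use warnings;
use Lucy::Plan::Schema;
use Lucy::Plan::FullTextType;
use Lucy::Analysis::PolyAnalyzer;

my $schema = Lucy::Plan::Schema->new;

# Assumption: an English PolyAnalyzer; the test case may use a different one.
my $analyzer = Lucy::Analysis::PolyAnalyzer->new( language => 'en' );

# 'highlightable => 1' makes the indexer store the data the Highlighter needs.
my $type = Lucy::Plan::FullTextType->new(
    analyzer      => $analyzer,
    highlightable => 1,
);

$schema->spec_field( name => 'site', type => $type );
```

An index built without this setting has to be rebuilt after the change, since the highlighting data is written at index time.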
> Highlighting/excerpt on URLs
> -----------------------------
>
> Key: LUCY-199
> URL: https://issues.apache.org/jira/browse/LUCY-199
> Project: Lucy
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.2.2 (incubating)
> Environment: Linux
> Reporter: Henry
> Fix For: 0.4.0
>
> Attachments: hltest1.tgz, LUCY-199-quickfix.patch
>
>
> If I explicitly specify excerpt_length:
>     my $hl = Lucy::Highlight::Highlighter->new(
>         searcher       => $searcher,
>         query          => $query_compiler,
>         field          => 'site',
>         excerpt_length => 60,
>     );
> ...and the field content is longer than 60, then
> $page_highlighter->create_excerpt($hit);
> returns '...'.
> Content which is shorter than 60 returns the highlighted excerpt as expected.
> If I comment out "excerpt_length => 60," above, then it returns the full
> non-truncated excerpt with highlighting as expected.
> Some samples longer than 60 characters which return only …/"..." when searching for [iol.co.za] or
> [news24.com] (brackets are mine):
> [www.iol.co.za/tonight/books/what-the-dickens-gets-a-statue-1.1130220]
> [http://www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSiteHome/0,,,00.html]
> [www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSite_TravelClub/0,,,00.html]
> The following return double-ellipses ("......" - ……), searching
> for [adsl mweb.com]:
> [http://www.mweb.co.za/helpcentre/ADSL/ADSLGeneralIdisagreewithyourusagereport.aspx]
> [http://www.mweb.co.za/helpcentre/FrequentlyAskedQuestions/MWEBHelpCentreFAQsHowdoI/FAQHowdoIHowdoImigratemyADSL/tabid/661/Default.aspx]
--
This message was sent by Atlassian JIRA
(v6.1#6144)