Posted to dev@lucene.apache.org by "Luca Cavanna (JIRA)" <ji...@apache.org> on 2013/08/09 15:06:48 UTC

[jira] [Updated] (LUCENE-4906) PostingsHighlighter's PassageFormatter should allow for rendering to arbitrary objects

     [ https://issues.apache.org/jira/browse/LUCENE-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Cavanna updated LUCENE-4906:
---------------------------------

    Attachment: LUCENE-4906.patch

I don't see why adding generics would complicate or limit the API. To me it would make it simpler and nicer (though it's not a big change to the API itself).

I'm attaching a patch to make what I had in mind more concrete, regardless of whether it gets integrated or not.

It's backwards compatible (even though the class is marked experimental): we have an abstract postings highlighter class that does most of the work and returns arbitrary objects (uses generics in order to do so). The PostingsHighlighter is its natural extension that returns String snippets.
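
To illustrate the shape of that design, here is a minimal Java sketch. It is not the attached patch and does not use Lucene's classes: SketchPassage, SketchFormatter, AbstractSketchHighlighter and StringSketchHighlighter are hypothetical names, and the sentence splitting is only a stand-in for real passage selection.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a scored passage; not Lucene's Passage class.
final class SketchPassage {
    final String text;
    final float score;

    SketchPassage(String text, float score) {
        this.text = text;
        this.score = score;
    }
}

// Hypothetical formatter: the type parameter T declares what it renders to.
interface SketchFormatter<T> {
    T format(List<SketchPassage> passages);
}

// Hypothetical base class: does the shared work and returns whatever
// type T the formatter produces.
abstract class AbstractSketchHighlighter<T> {
    protected abstract SketchFormatter<T> getFormatter();

    // Stand-in for passage selection: split the content after each period.
    T highlight(String content) {
        List<SketchPassage> passages = new ArrayList<>();
        for (String sentence : content.split("(?<=\\.)\\s+")) {
            passages.add(new SketchPassage(sentence, 1.0f));
        }
        return getFormatter().format(passages);
    }
}

// The String-returning extension: existing callers keep getting snippets.
class StringSketchHighlighter extends AbstractSketchHighlighter<String> {
    @Override
    protected SketchFormatter<String> getFormatter() {
        return passages -> {
            StringBuilder sb = new StringBuilder();
            for (SketchPassage p : passages) {
                if (sb.length() > 0) {
                    sb.append(" ... ");
                }
                sb.append(p.text);
            }
            return sb.toString();
        };
    }
}
```

With these assumptions, new StringSketchHighlighter().highlight("First one. Second one.") returns a plain String, while another subclass could bind T to any other type without touching the base class.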
 
I updated Mike's new test according to my changes. It should make it easier to see what code is needed to work with arbitrary objects using this approach.

I like that if you want to extend the abstract class you have to declare what type the formatter is supposed to return: it's more explicit and avoids any cast.
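
The point about casts can be sketched as follows. Both interfaces are hypothetical and exist only for this comparison (neither is Lucene's PassageFormatter): one returns Object, the other declares its return type through a type parameter.

```java
import java.util.Arrays;
import java.util.List;

// Untyped variant (hypothetical): the formatter returns Object,
// so every caller has to cast the result.
interface UntypedFormatterSketch {
    Object format(String passage);
}

// Typed variant (hypothetical): the return type is declared once,
// and the compiler checks it.
interface TypedFormatterSketch<T> {
    T format(String passage);
}

class CastComparisonSketch {
    static List<String> tokensUntyped(String passage) {
        UntypedFormatterSketch f = p -> Arrays.asList(p.split("\\s+"));
        // Unchecked cast: a wrong guess about the type only fails at run time.
        @SuppressWarnings("unchecked")
        List<String> tokens = (List<String>) f.format(passage);
        return tokens;
    }

    static List<String> tokensTyped(String passage) {
        TypedFormatterSketch<List<String>> f = p -> Arrays.asList(p.split("\\s+"));
        // No cast needed: the declared type parameter carries the information.
        return f.format(passage);
    }
}
```

Both methods return the same tokens; the difference is only whether the compiler or the caller is responsible for knowing the result type.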

Limitations with this approach: 
1) As mentioned before, the same highlighter instance cannot return heterogeneous types (to me that's more of a benefit).
2) Generics don't play well with arrays, so all the highlight methods that return arrays stay in the subclass that returns String snippets, to keep backwards compatibility. Moving them to the base class would most likely require returning List<FormattedPassage> instead (not backwards compatible).
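
The second limitation comes from Java not allowing generic array creation (new T[n] does not compile). A rough sketch of the resulting split, using hypothetical class names that are not from the patch: the generic base class can only hand back a List<T>, while the subclass that fixes T to String can still expose an array-returning method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generic base class.
abstract class ListReturningSketch<T> {
    protected abstract T render(String passage);

    // `new T[passages.length]` would be a compile error here,
    // so the generic method returns a List<T> instead of a T[].
    List<T> highlightAll(String[] passages) {
        List<T> out = new ArrayList<>(passages.length);
        for (String p : passages) {
            out.add(render(p));
        }
        return out;
    }
}

// Hypothetical String subclass: T is known to be String here,
// so an array-returning method can stay for backwards compatibility.
class StringArraySketch extends ListReturningSketch<String> {
    @Override
    protected String render(String passage) {
        return "<b>" + passage + "</b>";
    }

    String[] highlightAllAsArray(String[] passages) {
        return highlightAll(passages).toArray(new String[0]);
    }
}
```

Existing callers that expect String[] would keep using the subclass method, and only code written against the generic base class would see the List-based signature.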

I haven't updated the javadoc yet, but if you like my approach I can go ahead with it.

I would love to hear what you guys think about it. Generics can be scary... but useful sometimes too ;)
                
> PostingsHighlighter's PassageFormatter should allow for rendering to arbitrary objects
> --------------------------------------------------------------------------------------
>
>                 Key: LUCENE-4906
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4906
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4906.patch, LUCENE-4906.patch
>
>
> For example, in a server, I may want to render the highlight result to JsonObject to send back to the front-end. Today since we render to string, I have to render to JSON string and then re-parse to JsonObject, which is inefficient...
> Or, if (Rob's idea:) we make a query that's like MoreLikeThis but it pulls terms from snippets instead, so you get proximity-influenced salient/expanded terms, then perhaps that renders to just an array of tokens or fragments or something from each snippet.
