You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Robert Muir (Updated) (JIRA)" <ji...@apache.org> on 2012/03/02 13:22:57 UTC

[jira] [Updated] (LUCENE-3801) Generify FST shortestPaths() to take a comparator

     [ https://issues.apache.org/jira/browse/LUCENE-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3801:
--------------------------------

    Attachment: LUCENE-3801.patch

updated patch with tests (which exposes a real problem!).

I created "paired" versions (Pair<Long,Long> where its weight,output) of the two existing shortestPath tests: the simple one, and the random one. The simple one passes (somehow), but the random one triggers an assertion in shortestPaths

Basically the issue is that the original impl has an optimization: when you pick some minimum path (say 5), you know there is sequence of all zeros leading to some final state, otherwise its not actually the minimum :)

The code takes advantage of this and (fortunately) asserts that it finds this path of all NO_OUTPUTs. 

The problem is if you have a PairOutputs(weight,output), its only NO_OUTPUT if *both* weight and output are also NO_OUTPUT, but this only holds true for the weight... the output has its own separately pushed minimums that don't necessarily correspond...


                
> Generify FST shortestPaths() to take a comparator
> -------------------------------------------------
>
>                 Key: LUCENE-3801
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3801
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 3.6, 4.0
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>         Attachments: LUCENE-3801.patch, LUCENE-3801.patch
>
>
> Not sure we should do this, it costs 5-10% performance for WFSTSuggester.
> But maybe we can optimize something here, or maybe its just no big deal to us.
> Because in general, this could be pretty powerful, e.g. if you needed to store 
> some custom stuff in the suggester, you could use pairoutputs, or whatever.
> And the possibility we might need shortestPaths for other cool things... at the
> least I just wanted to have the patch up here.
> I haven't tested this on pairoutputs... but i've tested it with e.g. FloatOutputs
> and other things and it works fine.
> I tried to minimize the generics violations, there is only 1 (cannot create generic array).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org