Posted to issues@lucy.apache.org by "Marvin Humphrey (Created) (JIRA)" <ji...@apache.org> on 2011/12/13 03:25:31 UTC

[lucy-issues] [jira] [Created] (LUCY-198) LucyX::Search::DedupingSearcher

LucyX::Search::DedupingSearcher
-------------------------------

                 Key: LUCY-198
                 URL: https://issues.apache.org/jira/browse/LUCY-198
             Project: Lucy
          Issue Type: New Feature
            Reporter: Marvin Humphrey


Lucy could use a way to dedupe search results on a specified field.
An old module written for KinoSearch 0.2x, KSx::Search::DedupingSearcher,
provides this functionality.  When someone finds time, it would be useful
to modernize it and publish it as LucyX::Search::DedupingSearcher.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[lucy-issues] [jira] [Updated] (LUCY-198) LucyX::Search::DedupingSearcher

Posted by "Marvin Humphrey (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-198:
---------------------------------

    Attachment: 01-dedup.t
                DedupingSearcher.pm
    


[lucy-issues] [jira] [Commented] (LUCY-198) LucyX::Search::DedupingSearcher

Posted by "Marvin Humphrey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCY-198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168082#comment-13168082 ] 

Marvin Humphrey commented on LUCY-198:
--------------------------------------

A few notes for whoever might want to work on this:

* It would probably be best to start off by getting the test file
  to compile and at least build the index.
* BooleanQuery doesn't exist anymore.  You'll want to use ANDQuery
  and NOTQuery.
* Other LucyX classes don't use refaddr for inside-out object keys; they
  use $$self.  Take a look at other modules under trunk/perl/lib/LucyX
  for examples.
* This module was written in 2006-2007.  Sorting was completely
  overhauled during 2008-2010.  Things will need to change, and not
  always in obvious ways, as Lucy's modern sorting API is not public.
* The algorithm on which this module is based is sound, but the code
  may be buggy.
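
To illustrate the BooleanQuery note above, here is a minimal sketch of how
ANDQuery and NOTQuery can be combined to exclude documents whose dedupe field
matches a value that has already been seen.  The helper name exclude_seen, the
field names 'content' and 'url', and the example URL are hypothetical and used
only for illustration.  This is not taken from the attached DedupingSearcher.pm
and may not match how that module applies its queries.

```perl
use strict;
use warnings;
use Lucy::Search::ANDQuery;
use Lucy::Search::NOTQuery;
use Lucy::Search::TermQuery;

# Hypothetical helper (not from the attached module): wrap $query so that
# documents whose $field exactly matches any value in @seen are excluded.
sub exclude_seen {
    my ( $query, $field, @seen ) = @_;
    my @children = ($query);
    for my $value (@seen) {
        # Match documents carrying an already-seen value in the dedupe field.
        my $dupe = Lucy::Search::TermQuery->new(
            field => $field,
            term  => $value,
        );
        # Negate that match and require it alongside the original query.
        push @children,
            Lucy::Search::NOTQuery->new( negated_query => $dupe );
    }
    return Lucy::Search::ANDQuery->new( children => \@children );
}

# Example: search 'content' for "lucy", skipping one already-seen URL.
my $base = Lucy::Search::TermQuery->new(
    field => 'content',
    term  => 'lucy',
);
my $filtered = exclude_seen( $base, 'url', 'http://example.com/a' );
```

The resulting $filtered query can be passed to a searcher's hits() method like
any other query.  The sketch assumes the dedupe field is indexed as a
single-token field (for example a StringType field), so that a TermQuery on the
whole value matches.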

Contact the dev list if you're interested in working up a patch.
                
