Posted to dev@nutch.apache.org by "julien nioche (JIRA)" <ji...@apache.org> on 2007/10/01 18:24:50 UTC

[jira] Created: (NUTCH-563) Include custom fields in BasicQueryFilter

Include custom fields in BasicQueryFilter
-----------------------------------------

                 Key: NUTCH-563
                 URL: https://issues.apache.org/jira/browse/NUTCH-563
             Project: Nutch
          Issue Type: New Feature
          Components: searcher
            Reporter: julien nioche
            Priority: Minor
             Fix For: 0.9.0
         Attachments: diff.BasicQueryFilter.dynamicFields.txt

This patch allows additional fields to be included in the BasicQueryFilter by specifying runtime parameters.  Any parameter matching the regular expression "query\\.basic\\.(.+)\\.boost" will be added to the list of fields used by the BQF, with the specified float value used as its boost.
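
The parameter matching described above could look roughly like the following. This is only a sketch, not the code from the attached patch: a plain Map stands in for the Nutch Configuration object, and the class name BoostFieldFinder and the field name "category" are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: a plain Map stands in for the Nutch/Hadoop Configuration object.
class BoostFieldFinder {

  // Same expression as quoted in the issue description.
  private static final Pattern BOOST_PARAM =
      Pattern.compile("query\\.basic\\.(.+)\\.boost");

  /** Returns field name -> boost for every property whose full name matches the pattern. */
  static Map<String, Float> findAdditionalFields(Map<String, String> conf) {
    Map<String, Float> boosts = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : conf.entrySet()) {
      Matcher m = BOOST_PARAM.matcher(entry.getKey());
      if (!m.matches()) {
        continue; // not a boost parameter
      }
      // group(1) is the field name; the property value is the boost.
      boosts.put(m.group(1), Float.parseFloat(entry.getValue().trim()));
    }
    return boosts;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("query.basic.category.boost", "2.0"); // hypothetical custom field
    conf.put("some.other.property", "x");          // unrelated, ignored
    System.out.println(findAdditionalFields(conf));
  }
}
```

Only property names that match the whole expression are picked up, so unrelated settings are ignored.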

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532030 ] 

Doğacan Güney commented on NUTCH-563:
-------------------------------------

Why not just write a new plugin for new fields? I guess this is a bit simpler, but IMHO it is much cleaner to write a new plugin.

PS: Please send your diffs in unified format (diff -u). You can also generate them with "svn diff".



[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Jasper Kamperman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651208#action_12651208 ] 

Jasper Kamperman commented on NUTCH-563:
----------------------------------------

Hi Davide,

My laptop, which has nutch-0.9 on it, is in the shop, so I can't verify where that file is, but I think it is quite possible that nutch-0.8 doesn't yet have a BasicQueryFilter.java file.

Sorry I can't be of more help. I'm CC'ing the original author of the patch, but he just became a father, so it might be a while until you hear from him :-).

Jasper





[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Jasper Kamperman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651093#action_12651093 ] 

Jasper Kamperman commented on NUTCH-563:
----------------------------------------

Hi Davide,

I never tried to apply it to 0.8, sorry.

Jasper





[jira] Updated: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "julien nioche (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

julien nioche updated NUTCH-563:
--------------------------------

    Attachment: NUTCH-563.patch

Updated the original patch + added class level javadoc comment + example in nutch-default.xml



[jira] Updated: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic updated NUTCH-563:
-----------------------------------

    Fix Version/s:     (was: 0.9.0)
                   1.0.0



[jira] Updated: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "julien nioche (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

julien nioche updated NUTCH-563:
--------------------------------

    Attachment: diff.BasicQueryFilter.dynamicFields.txt



[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Jasper Kamperman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651425#action_12651425 ] 

Jasper Kamperman commented on NUTCH-563:
----------------------------------------

Hi Davide,

If I read the patch comments correctly, once you've applied the
patch and recompiled, you need to add properties named

query.basic.yourfieldname.boost

to an XML configuration file; each one names a field to be
included, and its value is the boost for that field. I don't think you
need to change the code beyond applying the patch. By the way, if you
like the patch, please vote for it :-)
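
For illustration, such an entry might look like the fragment below, for example in nutch-site.xml. The field name "category" and the boost value are placeholders, not values taken from the patch.

```xml
<!-- Hypothetical example: also search the custom field "category" with boost 2.0 -->
<property>
  <name>query.basic.category.boost</name>
  <value>2.0</value>
  <description>Boost applied to the "category" field by the BasicQueryFilter.</description>
</property>
```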







[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671124#action_12671124 ] 

Andrzej Bialecki  commented on NUTCH-563:
-----------------------------------------

I'd like to include this functionality in 1.0, but the patch doesn't document this in any way. Could you please add a bit of documentation (class-level javadoc, plus a commented-out example in nutch-default.xml)? Thanks.



[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "julien nioche (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532034 ] 

julien nioche commented on NUTCH-563:
-------------------------------------

As I explained in my message to the dev list, having a separate plugin for the new fields is not ideal. The BasicQueryFilter generates a subclause for each term of the user query, in which it searches for a match on any of the fields; e.g. for a user query A B, it generates +(field1:A field2:A ...) +(field1:B field2:B ...). To add a new field from a separate plugin, you would have to parse the clauses generated by the BQF, assume that there is one subclause per original term, and modify each subclause to add the field. That is not great, as it relies on assumptions about the query generated by the BQF and on that query not having been modified by another query filter in the meantime.

It would be good to be able to remove the BasicQueryFilter altogether and replace it with a custom version if necessary, but apparently you have to have it in the list of active plugins. 
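
To make the per-term expansion described above concrete, here is a minimal sketch that builds the same textual form, +(field1:A field2:A) +(field1:B field2:B). It is not the BasicQueryFilter code, which builds Lucene query objects rather than strings; the class and method names are invented for the example, and boosts are left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Sketch only: string form of the per-term subclauses, boosts omitted.
class ClauseSketch {

  /** One required subclause per term; each subclause tries every field. */
  static String expand(List<String> terms, List<String> fields) {
    StringJoiner query = new StringJoiner(" ");
    for (String term : terms) {
      StringJoiner sub = new StringJoiner(" ", "+(", ")");
      for (String field : fields) {
        sub.add(field + ":" + term);
      }
      query.add(sub.toString());
    }
    return query.toString();
  }

  public static void main(String[] args) {
    System.out.println(
        expand(Arrays.asList("A", "B"), Arrays.asList("field1", "field2")));
  }
}
```

Adding a field inside the filter means one more entry in the field list, whereas a separate plugin would have to find and rewrite every subclause after the fact.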




[jira] Resolved: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Sami Siren (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sami Siren resolved NUTCH-563.
------------------------------

    Resolution: Fixed
      Assignee: Sami Siren

committed, thanks



[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Davide (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651325#action_12651325 ] 

Davide commented on NUTCH-563:
------------------------------

Hi Jasper,

I've found the file BasicQueryFilter.java in Nutch-0.8.1, and after comparing it with the one in Nutch-0.9, I didn't find any differences. So I think I can also apply the patch to Nutch-0.8.1.
I must put the additional field name in

"phrase"

if (fieldName.equals("phrase")) continue;

am I right?

Do you think I could insert a generic name like "wildcard" there, and then put wildcard expressions such as * and ? in it?
I know Lucene supports wildcards, but Nutch does not.

Thanks a lot!

Davide



[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674885#action_12674885 ] 

Hudson commented on NUTCH-563:
------------------------------

Integrated in Nutch-trunk #729 (See [http://hudson.zones.apache.org/hudson/job/Nutch-trunk/729/])
     Include custom fields in BasicQueryFilter, contributed by Julien Nioche




[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Davide (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651152#action_12651152 ] 

Davide commented on NUTCH-563:
------------------------------

Hi Jasper,

could you explain to me how to apply it? I can't find the right file to apply the diff to.

Thank you a lot!



[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Beaucarnea (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655679#action_12655679 ] 

Beaucarnea commented on NUTCH-563:
----------------------------------

I applied the patch on my dev-1.0 version, but had to change one line in method findAdditionalFields(Configuration conf):
Iterator confEntriesIterator = conf.entries();      
changed to   
Iterator confEntriesIterator = conf.iterator();

Then, it worked great for me.
Thanks!
Martina




[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

Posted by "Davide (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651009#action_12651009 ] 

Davide commented on NUTCH-563:
------------------------------

Hi,
Is it possible to apply this code to Nutch 0.8.1 as well? Can you explain how?

Thanks
