Posted to dev@nutch.apache.org by "Srikarthik Venkataraman (JIRA)" <ji...@apache.org> on 2009/11/11 14:39:39 UTC

[jira] Commented: (NUTCH-573) Multiple Domains - Query Search

    [ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776451#action_12776451 ] 

Srikarthik Venkataraman commented on NUTCH-573:
-----------------------------------------------

I am very interested in using the Multiterm Query feature to search across multiple domains.
Can you please let me know whether this patch has been tested and is available in any of your release builds?

Can we expect this fix to be available in version 1.1, or could you provide an intermediate release?


> Multiple Domains - Query Search
> -------------------------------
>
>                 Key: NUTCH-573
>                 URL: https://issues.apache.org/jira/browse/NUTCH-573
>             Project: Nutch
>          Issue Type: Improvement
>          Components: searcher
>    Affects Versions: 0.9.0
>         Environment: All
>            Reporter: Rajasekar Karthik
>            Assignee: Enis Soztutar
>             Fix For: 1.1
>
>         Attachments: multiTermQuery_v1.patch
>
>
> Searching multiple domains can be done in Lucene, but not that efficiently in Nutch.
> Query:
> +content:"abc" +(site:"www.aaa.com" site:"www.bbb.com")
> works in Lucene, but the same query does not work in Nutch.
> In Lucene, it works with 
> org.apache.lucene.analysis.KeywordAnalyzer
> org.apache.lucene.analysis.standard.StandardAnalyzer 
> but NOT on
> org.apache.lucene.analysis.SimpleAnalyzer 
> Is the Nutch analyzer based on SimpleAnalyzer? If so, is there a workaround to make this work? Is there an option to change which analyzer Nutch uses?
> Just FYI, another solution (inefficient, I believe) that seems to work in Nutch is:
> <query> -site:"ccc.com" -site:"ddd.com" 
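
For illustration, the shape of the Lucene-style query quoted above (a required content term plus a group of optional site clauses, at least one of which must match) can be sketched as a small string builder. This is only a sketch: the function name multi_site_query and its parameters are hypothetical, it is not part of Nutch or Lucene, and it does not escape quotes or other special characters in its inputs. As the issue notes, Nutch's own query parser may not accept the resulting string.

```python
def multi_site_query(term, sites):
    """Build a Lucene-syntax query string (hypothetical helper).

    The content term is required (+), and the parenthesised group is
    required as a whole, so at least one of the site clauses must match.
    """
    if not sites:
        raise ValueError("at least one site is required")
    # Each site becomes an optional clause inside the required group.
    site_clauses = " ".join(f'site:"{site}"' for site in sites)
    return f'+content:"{term}" +({site_clauses})'


if __name__ == "__main__":
    print(multi_site_query("abc", ["www.aaa.com", "www.bbb.com"]))
```

Called with "abc" and the two hosts from the issue, this reproduces the query string in the description, with the colon present after both site fields.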
