Posted to dev@nutch.apache.org by "Emilijan Mirceski (JIRA)" <ji...@apache.org> on 2005/04/17 23:43:57 UTC

[jira] Created: (NUTCH-44) too many search results

too many search results
-----------------------

         Key: NUTCH-44
         URL: http://issues.apache.org/jira/browse/NUTCH-44
     Project: Nutch
        Type: Bug
  Components: web gui  
 Environment: web environment
    Reporter: Emilijan Mirceski


There should be a user-defined limit on the number of results the search engine can return.

For example, if one modifies the search URL to:
http://<my>/search.jsp?query=<some query>&hitsPerPage=20000&hitsPerSite=0

The search will try to return 20,000 results, which is bad for server-side performance.

Is it possible to have a setting in the config XML files to control this?


Thanks,
Emilijan

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira


[jira] Commented: (NUTCH-44) too many search results

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/NUTCH-44?page=comments#action_63055 ]
     
Doug Cutting commented on NUTCH-44:
-----------------------------------

I agree.  There should be a limit in the config file.  By default the limit should be 1000 hits.  A patch, anyone?
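
As a rough illustration of the limit described above, the following is a minimal sketch, not the eventual patch. The class name HitLimit and the method clamp are hypothetical; only the default of 1000 comes from this thread. It assumes the configured maximum has already been read from the config file as an int.

```java
/** Sketch only: caps a requested hitsPerPage value at a configured maximum. */
final class HitLimit {
    /** Default cap suggested in this thread. */
    static final int DEFAULT_MAX_HITS = 1000;

    private HitLimit() {}

    /**
     * Returns the requested hit count limited to the range [1, maxHits].
     * The parameter maxHits stands in for a value read from the config file.
     */
    static int clamp(int requested, int maxHits) {
        if (maxHits < 1) {
            throw new IllegalArgumentException("maxHits must be >= 1");
        }
        // Values above the cap are reduced to the cap; values below 1 are raised to 1.
        return Math.max(1, Math.min(requested, maxHits));
    }
}
```

With this sketch, a request for hitsPerPage=20000 under the default cap would be served as if it had asked for 1000 hits.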



[jira] Commented: (NUTCH-44) too many search results

Posted by "Sami Siren (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/NUTCH-44?page=comments#action_12364679 ] 

Sami Siren commented on NUTCH-44:
---------------------------------

Byron, have you made any progress with this?



[jira] Commented: (NUTCH-44) too many search results

Posted by "byron miller (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/NUTCH-44?page=comments#action_64078 ]
     
byron miller commented on NUTCH-44:
-----------------------------------

I am working on some code that I will submit over the weekend to set a maximum value for hits per page.

I discovered this to be a serious issue with OpenSearch as well, since some people were pulling down far too many records!
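
Since the same request parameter reaches both the web GUI and OpenSearch, one option is a single helper that both call. The sketch below is only an assumption about how that might look; HitsParam, parse, defaultHits and maxHits are hypothetical names, not Nutch code. It takes the raw parameter string (possibly null) and falls back to a default when the value is missing, non-numeric, or less than 1.

```java
/** Sketch only: parses a raw hitsPerPage parameter and applies a maximum. */
final class HitsParam {
    private HitsParam() {}

    /**
     * @param raw         raw request parameter value, may be null
     * @param defaultHits value used when raw is missing or invalid
     * @param maxHits     upper bound, e.g. taken from the config file
     */
    static int parse(String raw, int defaultHits, int maxHits) {
        int requested = defaultHits;
        if (raw != null) {
            try {
                requested = Integer.parseInt(raw.trim());
            } catch (NumberFormatException e) {
                // Non-numeric input: keep the default.
            }
        }
        if (requested < 1) {
            requested = defaultHits;
        }
        // Never return more than the configured maximum.
        return Math.min(requested, maxHits);
    }
}
```

A servlet or JSP could pass in the result of request.getParameter("hitsPerPage") directly, since that call returns null when the parameter is absent.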



[jira] Commented: (NUTCH-44) too many search results

Posted by "Stefan Neufeind (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/NUTCH-44?page=comments#action_12413155 ] 

Stefan Neufeind commented on NUTCH-44:
--------------------------------------

Hi,
Any progress on this?
