Posted to dev@nutch.apache.org by "Stefan Neufeind (JIRA)" <ji...@apache.org> on 2006/10/01 16:02:20 UTC
[jira] Created: (NUTCH-377) Add possibility to search for multiple values
Add possibility to search for multiple values
---------------------------------------------
Key: NUTCH-377
URL: http://issues.apache.org/jira/browse/NUTCH-377
Project: Nutch
Issue Type: Improvement
Components: searcher
Reporter: Stefan Neufeind
Searches with boolean operators (AND or OR) are not (yet) possible. All search-items are always searched with AND.
But it would be nice to allow multiple values for a certain field. Maybe that could be done using a separator?
As an example you might want to search for:
someword site:www.example.org|www.apache.org
Which (to my understanding) would allow searching for one or more words, restricted to those two sites. It would avoid having to implement AND and OR fully (maybe even including brackets) but would cover a few often-used cases, imho.
Easy/hard to do? To my understanding Lucene itself allows AND/OR searches. So it might basically be a problem of string parsing and query building towards Lucene?
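To make the proposed syntax concrete, here is a minimal sketch in plain Java of how a term like "site:www.example.org|www.apache.org" could be split into a field name and a list of alternative values. The class name MultiValueClause and its methods are hypothetical and are not part of Nutch or Lucene. The rendered string assumes Lucene's classic query syntax, where a required group of optional clauses, +(a b), matches documents containing at least one of them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nutch: holds one field and its alternative values.
final class MultiValueClause {
    final String field;
    final List<String> values;

    MultiValueClause(String field, List<String> values) {
        this.field = field;
        this.values = values;
    }

    // Split "field:v1|v2|..." at the first colon, then split the value part on '|'.
    static MultiValueClause parse(String term) {
        int colon = term.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("expected field:value, got: " + term);
        }
        String field = term.substring(0, colon);
        List<String> values = new ArrayList<>();
        for (String v : term.substring(colon + 1).split("\\|")) {
            if (!v.isEmpty()) {
                values.add(v); // skip empty alternatives such as "a||b"
            }
        }
        return new MultiValueClause(field, values);
    }

    // Render as a required group of optional clauses: +(field:v1 field:v2).
    // Assumption: the receiving parser treats clauses inside the group as OR.
    String toLuceneSyntax() {
        StringBuilder sb = new StringBuilder("+(");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(field).append(':').append(values.get(i));
        }
        return sb.append(')').toString();
    }
}
```

With the example above, parse() yields the field "site" with two values, and toLuceneSyntax() yields "+(site:www.example.org site:www.apache.org)". In real code one would more likely build a Lucene BooleanQuery with one SHOULD clause per value instead of going through a string, but where exactly that belongs in Nutch is left open here.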
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (NUTCH-377) Add possibility to search for multiple values
Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-377?page=comments#action_12439016 ]
Otis Gospodnetic commented on NUTCH-377:
----------------------------------------
You'd need to modify ./src/java/org/apache/nutch/analysis/NutchAnalysis.jj and regenerate the .java files that it produces.
[jira] Commented: (NUTCH-377) Add possibility to search for multiple values
Posted by "Stefan Neufeind (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-377?page=comments#action_12439018 ]
Stefan Neufeind commented on NUTCH-377:
---------------------------------------
Hmm, I'm not too sure I understand how to do that. There is one part which adds prohibited or required phrases but ...
To my understanding, isn't the above example parsed "as is" into one string for the whole "site:...|..."? If so, could the split maybe be done where the site command is evaluated? I had a look at query-site, but there doesn't seem to be much code there ...
What is a good syntax that the Nutch community could agree on? And could you maybe put together an initial patch for that?