Posted to user@nutch.apache.org by "Matthias W." <Ma...@e-projecta.com> on 2008/10/15 11:47:50 UTC
Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Hi,
I want to use Nutch for crawling content and the Lucene webapp to search the
Nutch-created index.
I thought Nutch creates a Lucene-interoperable index, but when I search the
index with the Lucene webapp I get no results.
I'm using Nutch 0.9 and Lucene 2.4.0.
Should I use an older Lucene version like 2.0, or is this not crucial?
I want to use Lucene because of its wildcard search and fuzzy search support.
Are there other possibilities to solve this?
--
View this message in context: http://www.nabble.com/Using-Nutch-for-crawling-and-Lucene-for-searching-%28Wildcard-Fuzzy%29-tp19990219p19990219.html
Sent from the Nutch - User mailing list archive at Nabble.com.
RE: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by "Matthias W." <Ma...@e-projecta.com>.
Patrick Markiewicz wrote:
>
> I'm not sure what you're using for searching, but wherever you
> reference an analyzer in Lucene, you need to change that from
> StandardAnalyzer to
> AnalyzerFactory.get(NutchConfiguration.create().get("en")) (which may
> require importing nutch-specific classes).
>
I changed:
Analyzer analyzer = new StandardAnalyzer();
to:
Configuration nutchConfig = NutchConfiguration.create();
AnalyzerFactory an = new AnalyzerFactory(nutchConfig);
NutchAnalyzer analyzer = an.get(nutchConfig.get("en"));
Now I get the following error message from Tomcat:
org.apache.jasper.JasperException
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:372)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
sun.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
java.lang.reflect.Method.invoke(Method.java:585)
org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:243)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAsPrivileged(Subject.java:517)
org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:272)
org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:161)
root cause
java.lang.NullPointerException
java.io.Reader.<init>(Reader.java:61)
java.io.BufferedReader.<init>(BufferedReader.java:76)
java.io.BufferedReader.<init>(BufferedReader.java:91)
org.apache.nutch.analysis.CommonGrams.init(CommonGrams.java:152)
org.apache.nutch.analysis.CommonGrams.<init>(CommonGrams.java:52)
org.apache.nutch.analysis.NutchDocumentAnalyzer$ContentAnalyzer.<init>(NutchDocumentAnalyzer.java:64)
org.apache.nutch.analysis.NutchDocumentAnalyzer.<init>(NutchDocumentAnalyzer.java:55)
org.apache.nutch.analysis.AnalyzerFactory.<init>(AnalyzerFactory.java:49)
org.apache.jsp.results_jsp._jspService(results_jsp.java:167)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:324)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
sun.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
java.lang.reflect.Method.invoke(Method.java:585)
org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:243)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAsPrivileged(Subject.java:517)
org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:272)
org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:161)
Full Sourcecode of results.jsp:
<%@ page import="org.apache.hadoop.conf.*"
         import="org.apache.nutch.util.NutchConfiguration"
         import="org.apache.nutch.analysis.*"
         import="javax.servlet.*, javax.servlet.http.*, java.io.*,
                 org.apache.lucene.document.*, org.apache.lucene.index.*,
                 org.apache.lucene.search.*, org.apache.lucene.queryParser.*,
                 org.apache.lucene.demo.*, org.apache.lucene.demo.html.Entities,
                 java.net.URLEncoder"
%>
<%
/*
        Author: Andrew C. Oliver, SuperLink Software, Inc.
        (acoliver2@users.sourceforge.net)

        This jsp page is deliberately written in the horrible "Java directly
        embedded in the page" style for an easy and concise demonstration of
        Lucene. Do note... if you write pages that look like this, sooner or
        later you'll have a maintenance nightmare. If you use JSPs, use
        taglibs and beans! That being said, this should be acceptable for a
        small page demonstrating how one uses Lucene in a web app.

        This is also deliberately overcommented. ;-)
*/
%>
<%!
public String escapeHTML(String s) {
  s = s.replaceAll("&", "&amp;");
  s = s.replaceAll("<", "&lt;");
  s = s.replaceAll(">", "&gt;");
  s = s.replaceAll("\"", "&quot;");
  s = s.replaceAll("'", "&#39;");
  return s;
}
%>
<%@include file="header.jsp"%>
<%
  boolean error = false;             // used to control flow for error messages
  String indexName = indexLocation;  // local copy of the configuration variable
  IndexSearcher searcher = null;     // the searcher used to open/search the index
  Query query = null;                // the Query created by the QueryParser
  Hits hits = null;                  // the search results
  int startindex = 0;                // the first index displayed on this page
  int maxpage = 50;                  // the maximum items displayed on this page
  String queryString = null;         // the query entered in the previous page
  String startVal = null;            // string version of startindex
  String maxresults = null;          // string version of maxpage
  int thispage = 0;                  // used for the for/next: either maxpage or
                                     // hits.length() - startindex, whichever is less
  try {
    searcher = new IndexSearcher(indexName);  // create an IndexSearcher for our page
    // NOTE: this operation is slow for large indices (much slower than the
    // search itself), so you might want to keep an IndexSearcher open
  } catch (Exception e) {
    // any error that happens is probably due to a permission problem or a
    // non-existent or otherwise corrupt index
%>
    <p>ERROR opening the Index - contact sysadmin!</p>
    <p>Error message: <%=escapeHTML(e.getMessage())%></p>
<%
    error = true;                    // don't do anything up to the footer
  }
%>
<%
  if (error == false) {                                // did we open the index?
    queryString = request.getParameter("query");       // get the search criteria
    startVal = request.getParameter("startat");        // get the start index
    maxresults = request.getParameter("maxresults");   // get max results per page
    try {
      maxpage = Integer.parseInt(maxresults);          // parse the max results first
      startindex = Integer.parseInt(startVal);         // then the start index
    } catch (Exception e) { }        // we don't care if something happens,
                                     // we'll just start at 0 or end at 50
    if (queryString == null)
      throw new ServletException("no query " +         // if you don't have a query then
                                 "specified");         // you probably played on the
                                                       // query string so you get the treatment
    Configuration nutchConfig = NutchConfiguration.create();
    AnalyzerFactory an = new AnalyzerFactory(nutchConfig);
    NutchAnalyzer analyzer = an.get(nutchConfig.get("en"));  // construct our usual analyzer
    try {
      QueryParser qp = new QueryParser("contents", analyzer);
      query = qp.parse(queryString); // parse the query and construct the Query object
    } catch (ParseException e) {
      // if it's just "operator error", send them a nice error HTML
%>
      <p>Error while parsing query: <%=escapeHTML(e.getMessage())%></p>
<%
      error = true;                  // don't bother with the rest of the page
    }
  }
%>
<%
  if (error == false && searcher != null) {  // if we've had no errors
                                             // (searcher != null was to handle
                                             // a weird compilation bug)
    thispage = maxpage;                      // default last element to maxpage
    hits = searcher.search(query);           // run the query
    if (hits.length() == 0) {                // if we got no results, tell the user
%>
      <p> I'm sorry I couldn't find what you were looking for. </p>
<%
      error = true;                          // don't bother with the rest of the page
    }
  }
  if (error == false && searcher != null) {
%>
  <table>
    <tr>
      <td>Document</td>
      <td>Summary</td>
    </tr>
<%
    if ((startindex + maxpage) > hits.length()) {
      thispage = hits.length() - startindex; // set the max index to maxpage or the last
    }                                        // actual search result, whichever is less
    for (int i = startindex; i < (thispage + startindex); i++) {  // for each element
%>
    <tr>
<%
      Document doc = hits.doc(i);            // get the next document
      String doctitle = doc.get("title");    // get its title
      String url = doc.get("path");          // get its path field
      if (url != null && url.startsWith("../webapps/")) {
        url = url.substring(10);             // strip off the ../webapps prefix if present
      }
      if ((doctitle == null) || doctitle.equals(""))  // use the path if it has no title
        doctitle = url;
      // then output!
%>
      <td><a href="<%=url%>"><%=doctitle%></a></td>
      <td><%=doc.get("summary")%></td>
    </tr>
<%
    }
%>
<%
    if ((startindex + maxpage) < hits.length()) {  // if there are more results,
                                                   // display the "more" link
      String moreurl = "results.jsp?query=" + URLEncoder.encode(queryString) +
                       "&maxresults=" + maxpage +  // construct the "more" link
                       "&startat=" + (startindex + maxpage);
%>
    <tr>
      <td></td><td><a href="<%=moreurl%>">More Results&gt;&gt;</a></td>
    </tr>
<%
    }
%>
  </table>
<%
  }                                  // then include our footer
  if (searcher != null)
    searcher.close();
%>
<%@include file="footer.jsp"%>
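For reference, the clamping of "thispage" in the JSP above can be isolated as plain Java; the class and method names below are invented purely for illustration:

```java
public class PagingDemo {
    // Mirrors the JSP: show at most 'maxpage' hits starting at 'startindex',
    // clamped to the number of hits actually available.
    static int pageSize(int totalHits, int startindex, int maxpage) {
        int thispage = maxpage;
        if ((startindex + maxpage) > totalHits) {
            thispage = totalHits - startindex;
        }
        return thispage;
    }

    public static void main(String[] args) {
        System.out.println(pageSize(120, 0, 50));   // prints 50 (a full page)
        System.out.println(pageSize(120, 100, 50)); // prints 20 (last, partial page)
    }
}
```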
What can I do now?
nutch OR again
Posted by Christopher Condit <co...@sdsc.edu>.
I've got nutch 0.9 and applied the patch from 479 as described here:
http://thethoughtlab.blogspot.com/2007/06/adding-non-required-term-patch-to-nutch.html
However, I'm still not seeing functioning OR queries when using the
nutch web application or the NutchBean from the command line...
1) Am I missing something else regarding the application of this patch?
2) Does the SVN trunk incorporate this patch? Should I try that?
Thanks,
-Chris
RE: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by Patrick Markiewicz <pm...@sim-gtech.com>.
Hi Matt,
If you read the Lucene documentation you will discover that the
Analyzer used for searching needs to be the same type that indexed the
content. I'm not sure what you're using for searching, but wherever you
reference an analyzer in Lucene, you need to change that from
StandardAnalyzer to
AnalyzerFactory.get(NutchConfiguration.create().get("en")) (which may
require importing nutch-specific classes). In order to display the URL,
you need to reference the "url" field as opposed to the "path" field
that Lucene uses initially. Use Luke to see what field stores the
content of the URL. That may have to change from "content" to
"contents".
To be honest, I never tried just changing the "path" field to
"url". You could try that and see if the StandardAnalyzer would work,
but I don't have enough knowledge about the analyzers to know if that
would work.
Patrick
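Sketching Patrick's suggestion as a small standalone searcher (Nutch 0.9 / Lucene 2.x APIs; the index path "nutchcrawl/index" and the field names "content"/"url" are assumptions here, so check them with Luke against your own index first):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.nutch.analysis.AnalyzerFactory;
import org.apache.nutch.analysis.NutchAnalyzer;
import org.apache.nutch.util.NutchConfiguration;

public class NutchIndexSearch {
    public static void main(String[] args) throws Exception {
        // Use the same analyzer family that built the index, not StandardAnalyzer.
        Configuration conf = NutchConfiguration.create();
        NutchAnalyzer analyzer = new AnalyzerFactory(conf).get("en"); // "en" = language id

        // Hypothetical path; point this at your merged Nutch index.
        IndexSearcher searcher = new IndexSearcher("nutchcrawl/index");

        // Nutch indexes page text in "content" and the link in "url"
        // (not "contents"/"path" as in the Lucene demo webapp).
        Query query = new QueryParser("content", analyzer).parse(args[0]);
        Hits hits = searcher.search(query);
        System.out.println(hits.length() + " hits");
        for (int i = 0; i < Math.min(10, hits.length()); i++) {
            System.out.println(hits.doc(i).get("url") + " : " + hits.doc(i).get("title"));
        }
        searcher.close();
    }
}
```

(Requires the Nutch 0.9 and Lucene 2.x jars plus the Nutch conf/ directory on the classpath, otherwise AnalyzerFactory fails the same way as in the stack trace above.)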
-----Original Message-----
From: Matthias W. [mailto:Matthias.Wangler@e-projecta.com]
Sent: Wednesday, October 15, 2008 6:22 AM
To: nutch-user@lucene.apache.org
Subject: Re: Using Nutch for crawling and Lucene for searching
(Wildcard/Fuzzy)
Thanks, but what does this mean for me?
I already tried to search the index with the Lucene webapp
(lucenewebapp.war
from Lucene package) including my nutch index 'nutchcrawl/index' and
'nutchcrawl/indexes/part-00000' but with both of them I get no results.
And my index is correct, because with Luke and the nutch webapp I get
results.
Andrzej Bialecki wrote:
>
> Matthias W. wrote:
>> Hi,
>> I want to use Nutch for crawling contents and Lucene webapp to search
the
>> Nutch-created index.
>> I thought nutch creates a Lucene interoperable index, but when I'm
>> searching
>> the index with the Lucene webapp I get no results.
>> I'm using Nutch 0.9 and Lucene 2.4.0.
>> Should I use an older Lucene version like 2.0 or is this not crucial?
>>
>> I want to use Lucene, because of its Wildcardsearch and Fuzzysearch,
...
>> Are there other possibilities to solve this?
>
> Nutch indexes are plain Lucene indexes. The only difference is that as
a
> side-effect of map-reduce processing these indexes may come in several
> parts, found in subdirectories named like part-xxxxx. Each
subdirectory
> holds a valid Lucene index.
>
> --
> Best regards,
> Andrzej Bialecki <><
> ___. ___ ___ ___ _ _ __________________________________
> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
> ___|||__|| \| || | Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com
>
>
>
Re: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by "Matthias W." <Ma...@e-projecta.com>.
Thanks, but what does this mean for me?
I already tried to search the index with the Lucene webapp (lucenewebapp.war
from Lucene package) including my nutch index 'nutchcrawl/index' and
'nutchcrawl/indexes/part-00000' but with both of them I get no results.
And my index is correct, because with Luke and the nutch webapp I get
results.
Andrzej Bialecki wrote:
>
> Matthias W. wrote:
>> Hi,
>> I want to use Nutch for crawling contents and Lucene webapp to search the
>> Nutch-created index.
>> I thought nutch creates a Lucene interoperable index, but when I'm
>> searching
>> the index with the Lucene webapp I get no results.
>> I'm using Nutch 0.9 and Lucene 2.4.0.
>> Should I use an older Lucene version like 2.0 or is this not crucial?
>>
>> I want to use Lucene, because of its Wildcardsearch and Fuzzysearch, ...
>> Are there other possibilities to solve this?
>
> Nutch indexes are plain Lucene indexes. The only difference is that as a
> side-effect of map-reduce processing these indexes may come in several
> parts, found in subdirectories named like part-xxxxx. Each subdirectory
> holds a valid Lucene index.
>
> --
> Best regards,
> Andrzej Bialecki <><
>
>
>
Re: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by Andrzej Bialecki <ab...@getopt.org>.
Matthias W. wrote:
> Hi,
> I want to use Nutch for crawling contents and Lucene webapp to search the
> Nutch-created index.
> I thought nutch creates a Lucene interoperable index, but when I'm searching
> the index with the Lucene webapp I get no results.
> I'm using Nutch 0.9 and Lucene 2.4.0.
> Should I use an older Lucene version like 2.0 or is this not crucial?
>
> I want to use Lucene, because of its Wildcardsearch and Fuzzysearch, ...
> Are there other possibilities to solve this?
Nutch indexes are plain Lucene indexes. The only difference is that as a
side-effect of map-reduce processing these indexes may come in several
parts, found in subdirectories named like part-xxxxx. Each subdirectory
holds a valid Lucene index.
--
Best regards,
Andrzej Bialecki <><
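Since each part-xxxxx subdirectory is itself a complete Lucene index, a plain Lucene client can open all the parts and search them as one; a minimal sketch (Lucene 2.x API, with a hypothetical "nutchcrawl/indexes" path):

```java
import java.io.File;
import java.io.FilenameFilter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Searchable;

public class SearchAllParts {
    // Accepts the part-xxxxx subdirectories produced by map-reduce indexing.
    static final FilenameFilter PART_FILTER = new FilenameFilter() {
        public boolean accept(File dir, String name) {
            return name.startsWith("part-");   // part-00000, part-00001, ...
        }
    };

    public static void main(String[] args) throws Exception {
        File[] parts = new File("nutchcrawl/indexes").listFiles(PART_FILTER);
        Searchable[] searchers = new Searchable[parts.length];
        for (int i = 0; i < parts.length; i++) {
            searchers[i] = new IndexSearcher(parts[i].getPath()); // each part is a full index
        }
        MultiSearcher searcher = new MultiSearcher(searchers);    // one logical index
        System.out.println(parts.length + " parts, " + searcher.maxDoc() + " docs");
        searcher.close();
    }
}
```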
Re: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by Alexander Aristov <al...@gmail.com>.
Or, as an option, you can modify Nutch to store the content in the index.
Andrzej, is that a bad idea? What do you think?
Best Regards
Alexander Aristov
2009/5/14 Andrzej Bialecki <ab...@getopt.org>
> inghe wrote:
>
>>
>> Hi,
>> I want to use Nutch for crawling contents and Lucene for extract and
>> analyze
>> the contents of the index created by Nutch. I'm trying to extract from the
>> index the contents of web pages, but i don' know how to set the
>> NutchDocumentAnalyzer in my application, if i use the StandardAnalyzer of
>> Lucene, i'll get to extract the fields "title", "url" but not the
>> "content".
>> I'm using Nutch1.0 and Lucene2.4.0
>>
>
> There is no content in Lucene indexes. The original content is stored in
> Nutch segments. You can use the command bin/nutch readseg to retrieve all
> (or selected) pages.
>
>
> --
> Best regards,
> Andrzej Bialecki <><
>
>
Re: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by Andrzej Bialecki <ab...@getopt.org>.
inghe wrote:
>
> Andrzej Bialecki wrote:
>> Page content is NOT stored in Lucene indexes that Nutch creates. It's
>> only indexed, which is not the same. Luke can show you the text in the
>> "content" field only because it reconstructs it from the index. This
>> reconstruction is incomplete because some information is missing (the
>> information discarded by NutchDocumentAnalyzer).
>>
>> As I wrote before, full content is stored in Nutch segments. That's why
>> Nutch can show you the full content, but Luke cannot.
>>
>>
>
> Thanks again, but is there a method to get a "content" informations through
> the libraries of Lucene? I would like to work on the content of the web
> pages extracted.
>
As it is now - there is no method. You would have to modify Nutch to
create indexes where "content" is both indexed and stored - but then
performance of your index will suffer.
--
Best regards,
Andrzej Bialecki <><
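For what it's worth, "both indexed and stored" corresponds to building the Lucene document with Field.Store.YES (Lucene 2.x API; where to hook this into Nutch's indexing code is left out, and the helper below is only an illustration):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class StoredContentSketch {
    public static Document buildDoc(String url, String title, String text) {
        Document doc = new Document();
        doc.add(new Field("url", url, Field.Store.YES, Field.Index.NO));
        doc.add(new Field("title", title, Field.Store.YES, Field.Index.TOKENIZED));
        // Store.YES keeps the full text in the index, so doc.get("content")
        // returns it at search time, at the cost of a much larger index.
        doc.add(new Field("content", text, Field.Store.YES, Field.Index.TOKENIZED));
        return doc;
    }
}
```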
Re: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by inghe <in...@gmail.com>.
Andrzej Bialecki wrote:
>
> Page content is NOT stored in Lucene indexes that Nutch creates. It's
> only indexed, which is not the same. Luke can show you the text in the
> "content" field only because it reconstructs it from the index. This
> reconstruction is incomplete because some information is missing (the
> information discarded by NutchDocumentAnalyzer).
>
> As I wrote before, full content is stored in Nutch segments. That's why
> Nutch can show you the full content, but Luke cannot.
>
>
Thanks again, but is there a way to get the "content" information through the
Lucene libraries? I would like to work on the content of the extracted web
pages.
Re: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by Andrzej Bialecki <ab...@getopt.org>.
inghe wrote:
> Thank you for the answer, but I still have a doubt!
> Why can I read the field "content" in Luke if I load the index created by
> Nutch?
> When I load the Nutch-1.0 index in Luke, I can view the fields "url",
> "title", "host", etc., but not all fields; if I click the Edit button, a
> window opens that contains other fields, including "content" with its
> value, but since it uses the SimpleAnalyzer the content is not displayed
> correctly. I tried to change the analyzer to NutchDocumentAnalyzer but I
> do not know how to do it.
>
> help :(
Page content is NOT stored in Lucene indexes that Nutch creates. It's
only indexed, which is not the same. Luke can show you the text in the
"content" field only because it reconstructs it from the index. This
reconstruction is incomplete because some information is missing (the
information discarded by NutchDocumentAnalyzer).
As I wrote before, full content is stored in Nutch segments. That's why
Nutch can show you the full content, but Luke cannot.
--
Best regards,
Andrzej Bialecki <><
Re: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by inghe <in...@gmail.com>.
Thank you for the answer, but I still have a doubt!
Why can I read the field "content" in Luke if I load the index created by
Nutch?
When I load the Nutch-1.0 index in Luke, I can view the fields "url", "title",
"host", etc., but not all fields; if I click the Edit button, a window opens
that contains other fields, including the field "content" with its value, but
since it uses the SimpleAnalyzer the content is not displayed correctly. I
tried to change the analyzer to NutchDocumentAnalyzer but I do not know how
to do it.
help :(
Andrzej Bialecki wrote:
>
> inghe wrote:
>>
>> Hi,
>> I want to use Nutch for crawling contents and Lucene for extract and
>> analyze
>> the contents of the index created by Nutch. I'm trying to extract from
>> the
>> index the contents of web pages, but i don' know how to set the
>> NutchDocumentAnalyzer in my application, if i use the StandardAnalyzer of
>> Lucene, i'll get to extract the fields "title", "url" but not the
>> "content".
>> I'm using Nutch1.0 and Lucene2.4.0
>
> There is no content in Lucene indexes. The original content is stored in
> Nutch segments. You can use the command bin/nutch readseg to retrieve
> all (or selected) pages.
>
>
> --
> Best regards,
> Andrzej Bialecki <><
>
>
>
Re: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by Andrzej Bialecki <ab...@getopt.org>.
inghe wrote:
>
> Hi,
> I want to use Nutch for crawling content and Lucene to extract and analyze
> the contents of the index created by Nutch. I'm trying to extract the
> contents of web pages from the index, but I don't know how to set the
> NutchDocumentAnalyzer in my application; if I use Lucene's StandardAnalyzer,
> I manage to extract the fields "title" and "url" but not "content".
> I'm using Nutch 1.0 and Lucene 2.4.0.
There is no content in Lucene indexes. The original content is stored in
Nutch segments. You can use the command bin/nutch readseg to retrieve
all (or selected) pages.
--
Best regards,
Andrzej Bialecki <><
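For completeness, the readseg invocations look like this (the crawl directory and segment timestamp below are placeholders; the commands are echoed rather than executed, so the shapes are visible even without a Nutch checkout on the PATH):

```shell
# Placeholder segment produced by a crawl; substitute your own timestamp.
SEGMENT=nutchcrawl/segments/20081015114750

# Dump everything in the segment to plain text under segdump/:
echo "bin/nutch readseg -dump $SEGMENT segdump"

# Dump only the parsed text (suppress the other segment parts):
echo "bin/nutch readseg -dump $SEGMENT segdump -nocontent -nofetch -nogenerate -noparse -noparsedata"

# List the segment's statistics:
echo "bin/nutch readseg -list $SEGMENT"
```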
Re: Using Nutch for crawling and Lucene for searching (Wildcard/Fuzzy)
Posted by inghe <in...@gmail.com>.
Hi,
I want to use Nutch for crawling content and Lucene to extract and analyze
the contents of the index created by Nutch. I'm trying to extract the
contents of web pages from the index, but I don't know how to set the
NutchDocumentAnalyzer in my application; if I use Lucene's StandardAnalyzer,
I manage to extract the fields "title" and "url" but not "content".
I'm using Nutch 1.0 and Lucene 2.4.0.