Posted to dev@nutch.apache.org by Rozina Sorathia <Ro...@KPITCummins.com> on 2005/11/12 07:00:42 UTC
to find if a url is present in the nutch master index
Hi,
I tried writing a class that calls the public Page getPage(String url)
method of DistributedWebDBReader, passing it the Lucene website's URL.
On debugging, I find that the constructor goes into a sleep loop, as follows:
public DistributedWebDBReader(NutchFileSystem nfs, File root)
        throws IOException, FileNotFoundException {
    //
    // Get the current db from the given nutchfs. It consists
    // of a bunch of directories full of files.
    //
    this.root = root;
    this.dbDir = new File(new File(root, "standard"), "webdb");
    //
    // Wait until the webdb is complete, by waiting till a given
    // file exists.
    //
    File dirIsComplete = new File(dbDir, "dbIsComplete");
    while (! nfs.exists(dirIsComplete)) {
        try {
            Thread.sleep(2000);
        } catch (InterruptedException ie) {
        }
    }
It sleeps every time in the above loop, so the constructor never returns.
Can anyone tell me what the problem is?
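The loop above only exits once <root>/standard/webdb/dbIsComplete exists, so
one way to narrow this down is to check for that marker file directly. Below
is a minimal sketch of such a check with a timeout. It assumes the db sits on
the local filesystem, so plain java.io.File stands in for NutchFileSystem; the
class name WebDbMarkerCheck, the waitForMarker helper, and the default root
"db" are hypothetical and not part of Nutch.

```java
import java.io.File;

public class WebDbMarkerCheck {

    /**
     * Polls until the marker file exists or the timeout elapses.
     * Returns true if the marker was found, false if we gave up.
     */
    public static boolean waitForMarker(File marker, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!marker.exists()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // give up instead of sleeping forever
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical default root; pass the real db root as the first argument.
        File root = new File(args.length > 0 ? args[0] : "db");

        // Same layout the constructor expects: <root>/standard/webdb/dbIsComplete
        File dbDir = new File(new File(root, "standard"), "webdb");
        File marker = new File(dbDir, "dbIsComplete");

        if (waitForMarker(marker, 10_000, 2_000)) {
            System.out.println("Marker found: " + marker.getAbsolutePath());
        } else {
            System.out.println("Marker missing: " + marker.getAbsolutePath());
        }
    }
}
```

If the marker is reported missing, the root passed to the reader probably
does not point at a directory containing standard/webdb/dbIsComplete.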
What I actually want to do is check whether a given URL is present in
Nutch's master index.
Thanks and regards,
Rozina Sorathia,
rozinas@kpitcummins.com
Phone No. (020) 5652 5000
Ext. :2206