Posted to dev@nutch.apache.org by Rozina Sorathia <Ro...@KPITCummins.com> on 2005/11/12 07:00:42 UTC

How to find out whether a URL is present in the Nutch master index

Hi,

I tried writing a class that calls the public Page getPage(String url)
method of DistributedWebDBReader, and I passed it the Lucene website's
URL.

While debugging, I found that the constructor goes into a sleep loop, as follows:

    public DistributedWebDBReader(NutchFileSystem nfs, File root)
            throws IOException, FileNotFoundException {
        //
        // Get the current db from the given nutchfs.  It consists
        // of a bunch of directories full of files.
        //
        this.root = root;
        this.dbDir = new File(new File(root, "standard"), "webdb");

        //
        // Wait until the webdb is complete, by waiting till a given
        // file exists.
        //
        File dirIsComplete = new File(dbDir, "dbIsComplete");
        while (! nfs.exists(dirIsComplete)) {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException ie) {
            }
        }

 

Every time I run it, it sleeps in the loop above and never gets past it,
presumably because the dbIsComplete file is never found. Can anyone tell
me what the problem is?
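
To rule out a wrong root directory, the marker file could be checked directly with a bounded wait before the reader is constructed. This is only a minimal sketch under some assumptions: the webdb is on the local filesystem, so plain java.io.File stands in for NutchFileSystem; the root passed in is the same one given to the reader; and the class name MarkerFileCheck and the default "db" path are placeholders of mine, not anything from Nutch.

```java
import java.io.File;

class MarkerFileCheck {

    // Builds the path the constructor polls for: <root>/standard/webdb/dbIsComplete.
    static File markerFor(File root) {
        File dbDir = new File(new File(root, "standard"), "webdb");
        return new File(dbDir, "dbIsComplete");
    }

    // Polls for the marker until it exists or timeoutMillis has elapsed.
    // Returns true if the marker was found, false on timeout.
    static boolean waitForMarker(File root, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        File marker = markerFor(root);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!marker.exists()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // "db" is a placeholder default; pass the real root as the first argument.
        File root = new File(args.length > 0 ? args[0] : "db");
        System.out.println("Looking for " + markerFor(root).getAbsolutePath());
        if (waitForMarker(root, 10000L, 2000L)) {
            System.out.println("Marker found; the constructor should not block.");
        } else {
            System.out.println("Marker missing; the constructor would keep sleeping.");
        }
    }
}
```

If this prints that the marker is missing, then either the root I am passing is not the db directory, or the webdb was never marked as complete.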

What I actually want to do is check whether a given URL is present in
Nutch's master index.

 

 

Thanks and regards,

Rozina Sorathia
rozinas@kpitcummins.com
Phone No. (020) 5652 5000, Ext. 2206