Posted to user@nutch.apache.org by Nicolas MARTIN <ni...@gmail.com> on 2009/02/17 14:32:20 UTC

indexing after fetching

Does Nutch necessarily index the data it has fetched when running the
bin/crawl command?

Regards

Re: indexing after fetching

Posted by Srinivas Gokavarapu <sr...@gmail.com>.
Hi,

First, check logs/hadoop.log to confirm that the page was fetched properly,
and check that the web page actually contains the query word. Then check the
name of the crawl folder: it should be "crawl". If you want to use a
different name, change the searcher.dir property in conf/nutch-default.xml.
Make sure all of these are correct.
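The checks above can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names are hypothetical, the "fetching " marker is assumed to be what the fetcher writes to the log, and the crawldb/segments/index layout is assumed to be what the crawl command produces in your Nutch version.

```shell
#!/bin/sh
# Sketch of the checks described above. The "fetching " log marker and the
# crawldb/segments/index layout are assumptions about the installed Nutch
# version; adjust them to what your installation actually produces.

# Print how many lines of the given log file mention a fetch.
count_fetches() {
  grep -c "fetching " "$1"
}

# Verify that a crawl folder contains the subfolders searching relies on.
# Prints the first missing subfolder and returns 1, or prints "ok: <dir>".
check_crawl_dir() {
  for sub in crawldb segments index; do
    if [ ! -d "$1/$sub" ]; then
      echo "missing: $1/$sub"
      return 1
    fi
  done
  echo "ok: $1"
}
```

For example, `count_fetches logs/hadoop.log` followed by `check_crawl_dir crawl`, both run from the Nutch installation directory.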

Srinivas.

On Wed, Feb 18, 2009 at 7:54 AM, Nicolas MARTIN <ni...@gmail.com> wrote:

> OK, thank you.
> Allow me to ask a further question: I ran the bin/crawl command for my
> first crawl, and I can't find the crawled pages through the Nutch search
> interface when typing a keyword (e.g. "homepage", since I crawled a
> homepage). I have attached a screenshot.
>
> If you find some time, help would be nice.
> Cheers,
>
> Nicolas.

Re: indexing after fetching

Posted by Nicolas MARTIN <ni...@gmail.com>.
OK, thank you.
Allow me to ask a further question: I ran the bin/crawl command for my
first crawl, and I can't find the crawled pages through the Nutch search
interface when typing a keyword (e.g. "homepage", since I crawled a
homepage). I have attached a screenshot.

If you find some time, help would be nice.
Cheers,

Nicolas.

Re: indexing after fetching

Posted by Sami Siren <ss...@gmail.com>.
Nicolas MARTIN wrote:
> Does Nutch necessarily index the data it has fetched when running the
> bin/crawl command?
>
Hi,

The bin/nutch crawl command indexes the data at the end of the cycle. If
you do not wish to index, just use the individual commands instead:
inject, generate, fetch, updatedb, generate...
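One cycle of the individual commands, with no indexing step, might look like the sketch below. It assumes the argument order used by the Nutch command-line tools of that era (crawldb first, then the seed or segments directory). The `run` and `crawl_cycle_no_index` helpers, the `NUTCH` and `DRY_RUN` variables, and the `SEGMENT` placeholder are hypothetical names introduced only for illustration.

```shell
#!/bin/sh
# Sketch: one crawl cycle without the indexing step.
# Assumptions (not from the thread): NUTCH points at the bin/nutch script,
# and the caller passes the crawl folder and the seed URL folder.
NUTCH="${NUTCH:-bin/nutch}"

# Print the command; execute it only when DRY_RUN is not set to 1.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

crawl_cycle_no_index() {
  crawl_dir="$1"
  seed_dir="$2"
  run "$NUTCH" inject "$crawl_dir/crawldb" "$seed_dir" || return 1
  run "$NUTCH" generate "$crawl_dir/crawldb" "$crawl_dir/segments" || return 1
  # generate creates a timestamped segment folder; the glob expands in
  # sorted order, so the last existing match is the newest segment.
  segment=""
  for s in "$crawl_dir"/segments/*; do
    if [ -d "$s" ]; then
      segment="$s"
    fi
  done
  # Placeholder name so that a dry run still prints complete commands.
  segment="${segment:-$crawl_dir/segments/SEGMENT}"
  run "$NUTCH" fetch "$segment" || return 1
  run "$NUTCH" updatedb "$crawl_dir/crawldb" "$segment" || return 1
}
```

Running `DRY_RUN=1 crawl_cycle_no_index crawl urls` only prints the four commands, which is a cheap way to review them before a real run; repeating the generate, fetch, and updatedb steps corresponds to the further cycles the list above trails off into.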

--
 Sami Siren