Posted to user@nutch.apache.org by Tolga <to...@ozses.net> on 2012/05/23 08:39:35 UTC

One last question

Thank you all, especially Lewis, Markus, and whomever I might have 
forgotten! It is working; I can crawl, index and search.

One last question though. On my Drupal website, I am redirecting 
www.example.com to example.com. However, I noticed that Nutch doesn't 
crawl the web site if there is a redirect (rewrite rule) involved. Is 
there a workaround?

Thanks a lot!

Re: One last question

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Please check out the http.redirect.max property in your nutch-default.xml
file, and then override it in your nutch-site.xml file. It should be set
to a responsible level, taking into consideration the nature of the pages
you are crawling.
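
For illustration only, an override in conf/nutch-site.xml might look roughly like the sketch below. The value 3 is an assumption chosen for the example, not a recommendation from this thread. In nutch-default.xml the property defaults to 0, in which case the fetcher does not follow a redirect immediately and instead records the target URL for a later fetch round.

```xml
<!-- conf/nutch-site.xml: overrides values defined in nutch-default.xml -->
<configuration>
  <property>
    <name>http.redirect.max</name>
    <!-- 3 is an illustrative value (an assumption); tune it for your site -->
    <value>3</value>
    <description>Maximum number of redirects the fetcher will follow when
    fetching a page. If 0 or negative, redirects are not followed
    immediately but recorded for later fetching.</description>
  </property>
</configuration>
```

If nutch-site.xml already contains a configuration element, only the property block needs to be added inside it.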

hth

Lewis



On Wed, May 23, 2012 at 9:40 AM, Tolga <to...@ozses.net> wrote:
> Yes, a redirect.
>
>
> On 5/23/12 11:37 AM, Lewis John Mcgibbney wrote:
>>
>> Can you please elaborate on a re-write rule? Do you mean a redirect?
>>
>> On Wed, May 23, 2012 at 7:39 AM, Tolga <to...@ozses.net> wrote:
>>>
>>> Thank you all, especially Lewis, Markus, and whomever I might have
>>> forgotten! It is working; I can crawl, index and search.
>>>
>>> One last question though. On my drupal website, I am redirecting
>>> www.example.com to example.com. However, I noticed that nutch doesn't
>>> crawl the web site if there is a rewrite rule involved. Is there a
>>> workaround?
>>>
>>> Thanks a lot!
>>
>>
>>
>



-- 
Lewis

Re: One last question

Posted by Tolga <to...@ozses.net>.
Yes, a redirect.

On 5/23/12 11:37 AM, Lewis John Mcgibbney wrote:
> Can you please elaborate on a re-write rule? Do you mean a redirect?
>
> On Wed, May 23, 2012 at 7:39 AM, Tolga <to...@ozses.net> wrote:
>> Thank you all, especially Lewis, Markus, and whomever I might have
>> forgotten! It is working; I can crawl, index and search.
>>
>> One last question though. On my drupal website, I am redirecting
>> www.example.com to example.com. However, I noticed that nutch doesn't crawl
>> the web site if there is a rewrite rule involved. Is there a workaround?
>>
>> Thanks a lot!
>
>

Re: One last question

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Can you please elaborate on a re-write rule? Do you mean a redirect?

On Wed, May 23, 2012 at 7:39 AM, Tolga <to...@ozses.net> wrote:
> Thank you all, especially Lewis, Markus, and whomever I might have
> forgotten! It is working; I can crawl, index and search.
>
> One last question though. On my drupal website, I am redirecting
> www.example.com to example.com. However, I noticed that nutch doesn't crawl
> the web site if there is a rewrite rule involved. Is there a workaround?
>
> Thanks a lot!



-- 
Lewis