Posted to user@nutch.apache.org by Tolga <to...@ozses.net> on 2012/05/23 08:39:35 UTC
One last question
Thank you all, especially Lewis, Markus, and whomever I might have
forgotten! It is working; I can crawl, index and search.
One last question though. On my Drupal website, I am redirecting
www.example.com to example.com. However, I noticed that Nutch doesn't
crawl the web site if there is a rewrite rule involved. Is there a
workaround?
Thanks a lot!
Re: One last question
Posted by Lewis John Mcgibbney <le...@gmail.com>.
Please check the http.redirect.max property in nutch-default.xml, and
then override it in your nutch-site.xml file. It should be set to a
reasonable level, taking into consideration the nature of the pages
you are crawling.
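As a rough sketch of what such an override might look like in
conf/nutch-site.xml: the value 3 below is only an illustrative assumption,
not a recommendation. As far as I know, the shipped default in
nutch-default.xml is 0, in which case the fetcher does not follow a
redirect immediately but records the target URL for a later fetch round.

```xml
<!-- conf/nutch-site.xml: values here override those in nutch-default.xml -->
<configuration>
  <property>
    <name>http.redirect.max</name>
    <!-- Illustrative value (an assumption). With 0 or a negative value,
         redirect targets are recorded for a later round instead of being
         followed right away. -->
    <value>3</value>
    <description>Maximum number of redirects the fetcher will follow
    when trying to fetch a page.</description>
  </property>
</configuration>
```

If your nutch-site.xml already has a configuration element, add only the
property element inside it rather than a second root.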
hth
Lewis
On Wed, May 23, 2012 at 9:40 AM, Tolga <to...@ozses.net> wrote:
> Yes, a redirect.
>
>
> On 5/23/12 11:37 AM, Lewis John Mcgibbney wrote:
>>
>> Can you please elaborate on a re-write rule? Do you mean a redirect?
>>
>> On Wed, May 23, 2012 at 7:39 AM, Tolga<to...@ozses.net> wrote:
>>>
>>> Thank you all, especially Lewis, Markus, and whomever I might have
>>> forgotten! It is working; I can crawl, index and search.
>>>
>>> One last question though. On my Drupal website, I am redirecting
>>> www.example.com to example.com. However, I noticed that Nutch doesn't
>>> crawl the web site if there is a rewrite rule involved. Is there a
>>> workaround?
>>>
>>> Thanks a lot!
>>
>>
>>
>
--
Lewis
Re: One last question
Posted by Tolga <to...@ozses.net>.
Yes, a redirect.
On 5/23/12 11:37 AM, Lewis John Mcgibbney wrote:
> Can you please elaborate on a re-write rule? Do you mean a redirect?
>
> On Wed, May 23, 2012 at 7:39 AM, Tolga<to...@ozses.net> wrote:
>> Thank you all, especially Lewis, Markus, and whomever I might have
>> forgotten! It is working; I can crawl, index and search.
>>
>> One last question though. On my Drupal website, I am redirecting
>> www.example.com to example.com. However, I noticed that Nutch doesn't crawl
>> the web site if there is a rewrite rule involved. Is there a workaround?
>>
>> Thanks a lot!
>
>
Re: One last question
Posted by Lewis John Mcgibbney <le...@gmail.com>.
Can you please elaborate on a re-write rule? Do you mean a redirect?
On Wed, May 23, 2012 at 7:39 AM, Tolga <to...@ozses.net> wrote:
> Thank you all, especially Lewis, Markus, and whomever I might have
> forgotten! It is working; I can crawl, index and search.
>
> One last question though. On my Drupal website, I am redirecting
> www.example.com to example.com. However, I noticed that Nutch doesn't crawl
> the web site if there is a rewrite rule involved. Is there a workaround?
>
> Thanks a lot!
--
Lewis