Posted to user@nutch.apache.org by Lars Götte <la...@drive.eu> on 2017/05/16 13:42:03 UTC

Duplicate content http/https

Dear community,

I'm getting duplicate entries in my search results: the same page shows up under both its http and its https URL. Is there a configuration option to stop this behaviour?

Thank you,

Lars

RE: Duplicate content http/https

Posted by Markus Jelsma <ma...@openindex.io>.
Hi - use the urlnormalizer-protocol plugin for this.

Markus
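
For readers who want to see what protocol normalization amounts to, here is a minimal Python sketch of the idea. It is an illustration only, not the plugin's code. The host table, the function name normalize_protocol and the example hosts are all hypothetical. In Nutch itself the plugin has to be enabled through the plugin.includes property in nutch-site.xml; check the plugin's own documentation for the format of its rules file.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table: host -> the one scheme we want to keep for that host.
PREFERRED_SCHEME = {
    "example.org": "https",
    "legacy.example.com": "http",
}


def normalize_protocol(url, rules=PREFERRED_SCHEME):
    """Rewrite http/https URLs so each known host uses a single scheme."""
    parts = urlsplit(url)
    # Only touch plain web URLs that carry a host name.
    if parts.scheme not in ("http", "https") or parts.hostname is None:
        return url
    # Leave URLs with an explicit port alone; swapping the scheme
    # on e.g. ":8080" could point at a different service.
    if parts.port is not None:
        return url
    preferred = rules.get(parts.hostname)
    if preferred is None or preferred == parts.scheme:
        return url
    return urlunsplit(
        (preferred, parts.netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    for u in ("http://example.org/page", "https://example.org/page"):
        print(u, "->", normalize_protocol(u))
```

Because both variants of a page map to the same string, the crawler ends up with one record instead of two. Hosts missing from the table are left unchanged, so the rule set can be built up host by host.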

 
 