Posted to user@manifoldcf.apache.org by h0444xk8 <h0...@posteo.de> on 2022/01/31 11:10:28 UTC
Web connector HTTP response header content-encoding:gzip
Hi all,
I have a question regarding the Web connector. I want to crawl a website
that responds with compressed HTTP content. The returned HTTP response
header is Content-Encoding: gzip.
Is ManifoldCF able to handle such compressed websites? When I try to crawl
the website, a 404 is shown in the history. However, the site is
accessible from the same machine, and crawling a non-compressed website
works fine. So the 404 may be misleading.
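To narrow this down outside of ManifoldCF, a check like the following could compare what the server returns for a request that accepts gzip and one that asks for uncompressed content. This is only a sketch using the Python standard library; the function names `decode_body` and `probe` are my own, and it says nothing about how the Web connector itself builds its requests.

```python
import gzip
import urllib.error
import urllib.request
from typing import Optional, Tuple


def decode_body(raw: bytes, content_encoding: Optional[str]) -> bytes:
    """Decompress the body if the server labelled it as gzip, else return it unchanged."""
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.decompress(raw)
    return raw


def probe(url: str, accept_encoding: str) -> Tuple[int, Optional[str], int]:
    """Fetch url with the given Accept-Encoding value.

    Returns (HTTP status, Content-Encoding header or None, decoded body length).
    """
    request = urllib.request.Request(url, headers={"Accept-Encoding": accept_encoding})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            status = response.status
            encoding = response.headers.get("Content-Encoding")
            raw = response.read()
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the exception still carries status, headers and body.
        status = error.code
        encoding = error.headers.get("Content-Encoding")
        raw = error.read()
    return status, encoding, len(decode_body(raw, encoding))
```

Calling `probe(url, "gzip")` and `probe(url, "identity")` against the site in question would show whether the status code or the Content-Encoding header differs between the two requests. If the server sends gzip even for `identity`, changing the request header alone would not help.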
Can I set the Accept-Encoding request header for the Web connector
repository connection to avoid getting compressed content?
Or is there any other solution?
Regards
Sebastian