Posted to user@nutch.apache.org by A Laxmi <a....@gmail.com> on 2014/04/09 16:42:48 UTC

Nutch 2.2.1: Web Content size of a particular website

Hi,

I may not be approaching this the right way, so I need some help. Is there
a way to find the approximate size of the web content of a particular
website in Nutch 2.2.1?

I have crawled a research website that has a lot of images, PDFs, etc., and
I would like to know the content size of all the files on that website.
Please advise.

Thanks.