Posted to user@nutch.apache.org by "pesmadhu ." <pe...@gmail.com> on 2016/04/06 11:42:28 UTC
Apache Nutch : query
Hi,
We have a requirement to scrape data from URLs whose pages contain tables.
We need to read the table content and, depending on the value in a
particular column, download the corresponding file.
Example URLs: http://exporter.nih.gov/ExPORTER_Catalog.aspx
http://exporter.nih.gov/ExPORTER_Catalog.aspx?sid=3&index=0
http://exporter.nih.gov/ExPORTER_Catalog.aspx?sid=0&index=1
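The requirement above (read each table row, check one column, collect the file link from matching rows) can be sketched roughly as follows. This is plain Python using only the standard library, not Nutch code, and it is only an illustration under assumptions: the markup in SAMPLE_HTML, the column layout, and the file paths are all hypothetical and are not taken from the real pages listed above.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects each <tr> as (list of cell texts, list of hrefs found in that row)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cells = None  # cell texts of the current row, None outside a row
        self._links = None  # hrefs seen in the current row
        self._text = None   # text fragments of the current cell, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells, self._links = [], []
        elif tag in ("td", "th") and self._cells is not None:
            self._text = []
        elif tag == "a" and self._text is not None:
            href = dict(attrs).get("href")
            if href:
                self._links.append(href)

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._text is not None:
            self._cells.append("".join(self._text).strip())
            self._text = None
        elif tag == "tr" and self._cells is not None:
            self.rows.append((self._cells, self._links))
            self._cells = self._links = None


def links_where(html, column, wanted):
    """Return the hrefs from every row whose cell at index `column` equals `wanted`."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [
        href
        for cells, links in parser.rows
        if len(cells) > column and cells[column] == wanted
        for href in links
    ]


# Hypothetical markup for illustration only; the real page structure may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Fiscal Year</th><th>File</th></tr>
  <tr><td>2015</td><td><a href="files/a_2015.zip">ZIP</a></td></tr>
  <tr><td>2016</td><td><a href="files/a_2016.zip">ZIP</a></td></tr>
</table>
"""

print(links_where(SAMPLE_HTML, 0, "2016"))  # prints ['files/a_2016.zip']
```

The links returned here are relative, so an actual download step would still need to resolve them against the page URL and fetch them.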
Please check and suggest whether we can achieve this using Apache Nutch.
I have one more question: what is the main use of Apache Nutch?
--
Regards,
Madhusudhan.