
Posted to user@spark.apache.org by mrm <ma...@skimlinks.com> on 2014/07/10 11:58:34 UTC

running scrapy (or any other scraper) on the cluster?

Hi all,

Has anybody tried running Scrapy on a cluster? If so, I would appreciate
hearing about the general approach you took (multiple spiders or a single
spider, how to distribute URLs across nodes, etc.). I would also be
interested in hearing about any experience running a different scraper on a
cluster, in case Scrapy is not the best choice.

Thank you!

Maria


