Posted to dev@oodt.apache.org by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/07/08 20:06:31 UTC
Re: cas crawler multi-threaded
Hi Robert,
Thanks for your question. Answers below:
On Jul 7, 2011, at 11:24 AM, Ando, Robert R (388K) wrote:
> Chris,
>
> not sure who to ask.
>
> Is the cas-crawler multi-threaded? If a speedup is needed,
> would it be hard to do so? (There are many
> other ways to speed up archiving files.)
The cas-crawler is intentionally not multi-threaded by default. Instead,
the architecture of the system handles this by allowing multiple
crawlers to run against a single directory area. You can prevent
them from trampling one another by using PreConditionComparators
and Actions to isolate which types of files each crawler should
crawl, or by using the noRecur and crawlForDirs options to isolate
them as well.
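To illustrate the file-type isolation idea above, here is a minimal Python sketch. It is not OODT code and does not use the crawler's API; the crawler names and glob patterns are hypothetical. It only shows the principle that each file should be claimed by at most one crawler.

```python
from fnmatch import fnmatchcase

# Hypothetical assignment: one set of glob patterns per crawler instance.
CRAWLER_PATTERNS = {
    "crawler-a": ["*.h5", "*.hdf"],
    "crawler-b": ["*.nc"],
}


def owners(filename, patterns=CRAWLER_PATTERNS):
    """Return the names of every crawler whose patterns match filename."""
    return sorted(
        name
        for name, globs in patterns.items()
        if any(fnmatchcase(filename, g) for g in globs)
    )


def assign(filenames, patterns=CRAWLER_PATTERNS):
    """Map each crawler to its files; fail if two crawlers claim one file."""
    plan = {name: [] for name in patterns}
    for f in filenames:
        claimed = owners(f, patterns)
        if len(claimed) > 1:
            raise ValueError(f"{f} is claimed by more than one crawler: {claimed}")
        if claimed:
            plan[claimed[0]].append(f)
    return plan
```

Files that match no pattern are simply left out of the plan, and overlapping patterns are reported as an error instead of letting two crawlers pick up the same file.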
Another strategy is separating out the ingest/staging area by directory type
and then instantiating multiple crawlers based on that organization.
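As a rough sketch of that directory-based strategy, assuming the staging area has one subdirectory per data type: the Python below starts one worker per subdirectory and lets them run concurrently. The function names are hypothetical, and crawl_dir is only a stand-in that lists files; in a real deployment each worker would be a separate crawler instance pointed at its own directory.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def crawl_dir(directory):
    """Stand-in for one crawler: list the files directly inside directory."""
    return sorted(
        entry.path for entry in os.scandir(directory) if entry.is_file()
    )


def crawl_staging(staging_root, max_workers=4):
    """Run one crawl_dir worker per top-level subdirectory of staging_root."""
    subdirs = sorted(
        entry.path for entry in os.scandir(staging_root) if entry.is_dir()
    )
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker sees only its own subdirectory, so no two overlap.
        results = pool.map(crawl_dir, subdirs)
        return dict(zip(subdirs, results))
```

Because the subdirectories are disjoint, the workers need no locking or coordination with each other.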
Does that help/make sense? We can chat more but I thought that
would be a good start to the conversation.
Cheers,
Chris
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++