Posted to dev@nutch.apache.org by Pablo Mayrgundter <pa...@gmail.com> on 2005/05/11 19:23:20 UTC

NDFS Questions

Hi,

I'm testing a deployment of Nutch at work and am trying to decide what
filesystem to use.  I got the NDFS demo working, and am excited to use
it, but it looks pretty new.  Should I consider using it for
production?  I'm considering storing quite a lot of data, in the
10-100 TB range.

Also, I'm wondering about the read/write performance.  From some
initial testing, it looks like I'm not getting any speedup reading
from two data nodes compared to reading the same data from a single
host using a program like scp.  I'm wondering if any performance
tuning has been done yet on NDFS.

Cheers,
Pablo Mayrgundter

Re: NDFS Questions

Posted by Doug Cutting <cu...@nutch.org>.
Pablo Mayrgundter wrote:
> I'm testing a deployment of Nutch at work and am trying to decide what
> filesystem to use.  I got the NDFS demo working, and am excited to use
> it, but it looks pretty new.  Should I consider using it for
> production?  I'm considering storing quite a lot of data, in the
> 10-100 TB range.

I would not yet recommend NDFS for production.  Soon, though.

> Also, I'm wondering about the read/write performance.  From some
> initial testing, it looks like I'm not getting any speedup reading
> from two data nodes compared to reading the same data from a single
> host using a program like scp.  I'm wondering if any performance
> tuning has been done yet on NDFS.

No, very little performance tuning has been done yet.

Doug