Posted to hdfs-user@hadoop.apache.org by HarishKashyap TS <Ha...@infosys.com> on 2009/09/22 16:25:20 UTC

HDFS single node cluster vs. NTFS performance comparison

Hi All,

Has anyone tested and compared the performance of a single-node HDFS cluster against the NTFS file system? Are any sample results of such a comparison available?

Your input/feedback regarding this would be very helpful.

Regards,
Harish Kashyap


Re: HDFS single node cluster vs. NTFS performance comparison

Posted by Anthony Urso <an...@gmail.com>.
The Annals of Improbable Research may be interested.  I believe they
recently published a study comparing apples and oranges.

Cheers,
Anthony

On Wed, Sep 23, 2009 at 8:11 AM, HarishKashyap TS
<Ha...@infosys.com> wrote:
> Hi All,
>
>
>
> I have completed performance testing of a single-node HDFS cluster against
> the NTFS file system, using modified versions of the SLG tools provided by
> Hadoop. Under similar environment conditions, the performance of the two
> file systems was compared across various file operations.
>
> Our tests give statistics on the amount of overhead introduced by HDFS.
>
> For example, if the number of files created is taken as the metric, the
> local file system (NTFS) performs 30% better than HDFS.
>
>
>
> We are planning to publish an article on this. Suggestions for technical
> forums where it would be appropriate to publish the article would be a
> great help.
>
>
>
> Aaron,
>
> Thanks a lot for your inputs and time.
>
>
>
> Regards,
>
> Harish Kashyap
>
>
>
> From: Aaron Kimball [mailto:aaron@cloudera.com]
> Sent: Wednesday, September 23, 2009 1:49 AM
> To: hdfs-user@hadoop.apache.org
> Subject: Re: HDFS single node cluster vs. NTFS performance comparison
>
>
>
> To my knowledge, nobody's benchmarked this in a rigorous fashion. It's
> virtually certain, though, that on the same machine, NTFS would perform
> faster. HDFS does not write directly to the disk driver; it uses the local
> filesystem of the node on which it's installed. So any HDFS writes would
> themselves be channeled through NTFS and then down to the disk. The read
> path, of course, would go through NTFS first and then via HDFS out to the
> client.
>
> So, HDFS can only add overhead. How much overhead is probably not a
> published number.
>
> - Aaron
>

RE: HDFS single node cluster vs. NTFS performance comparison

Posted by HarishKashyap TS <Ha...@infosys.com>.
Hi All,

I have completed performance testing of a single-node HDFS cluster against the NTFS file system, using modified versions of the SLG tools provided by Hadoop. Under similar environment conditions, the performance of the two file systems was compared across various file operations.
Our tests give statistics on the amount of overhead introduced by HDFS.
For example, if the number of files created is taken as the metric, the local file system (NTFS) performs 30% better than HDFS.
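
The 30% figure can be read in more than one way, so it helps to spell out the arithmetic. What follows is a minimal sketch, assuming "30% better" means NTFS creates 30% more files than HDFS in the same time. The function names and the sample rates are hypothetical and are not measured values from these tests.

```python
def local_advantage(local_rate, hdfs_rate):
    """How much faster the local file system is, as a fraction of the HDFS rate."""
    return local_rate / hdfs_rate - 1.0


def hdfs_shortfall(local_rate, hdfs_rate):
    """How far HDFS falls short, as a fraction of the local file system rate."""
    return 1.0 - hdfs_rate / local_rate


# Hypothetical rates in files created per second, chosen only to match a 30% gap.
local_rate = 130.0
hdfs_rate = 100.0

print(f"local advantage: {local_advantage(local_rate, hdfs_rate):.1%}")
print(f"HDFS shortfall:  {hdfs_shortfall(local_rate, hdfs_rate):.1%}")
```

Under that reading, the same gap is a 30% advantage for NTFS but roughly a 23% shortfall for HDFS, so a write-up should state which baseline the percentage uses.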

We are planning to publish an article on this. Suggestions for technical forums where it would be appropriate to publish the article would be a great help.

Aaron,
Thanks a lot for your inputs and time.

Regards,
Harish Kashyap


Re: HDFS single node cluster vs. NTFS performance comparison

Posted by Aaron Kimball <aa...@cloudera.com>.
To my knowledge, nobody's benchmarked this in a rigorous fashion. It's
virtually certain, though, that on the same machine, NTFS would perform
faster. HDFS does not write directly to the disk driver; it uses the local
filesystem of the node on which it's installed. So any HDFS writes would
themselves be channeled through NTFS and then down to the disk. The read
path, of course, would go through NTFS first and then via HDFS out to the
client.

So, HDFS can only add overhead. How much overhead is probably not a
published number.
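
One way to put a number on that overhead for a single metric is to time the same operation on both paths. Below is a minimal sketch of the local-file-system half only, assuming files created per second is the metric of interest. The function name, file naming scheme, and default count are hypothetical, and this is not the SLG tooling mentioned elsewhere in this thread. The HDFS half would need the Hadoop client and is not shown.

```python
import os
import tempfile
import time


def files_created_per_second(directory, count=1000, payload=b"x"):
    """Create `count` small files in `directory` and return files per second."""
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"bench_{i:06d}.dat")
        # Opening, writing, and closing each file is the operation being timed.
        with open(path, "wb") as handle:
            handle.write(payload)
    elapsed = time.perf_counter() - start
    return count / elapsed


if __name__ == "__main__":
    # A temporary directory keeps the benchmark files off any real data path.
    with tempfile.TemporaryDirectory() as scratch:
        rate = files_created_per_second(scratch, count=1000)
        print(f"local file system: {rate:.0f} files created per second")
```

A single run like this is sensitive to caching and background activity, so repeated runs on an otherwise idle machine would be needed before comparing the result with an HDFS figure.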

- Aaron
