Posted to hdfs-user@hadoop.apache.org by KevinKuei <kk...@above.net.tw> on 2011/01/06 05:04:02 UTC

response time increase??

Hi,

I'm planning a YouTube-like online video site and am looking for a
suitable file system.
HDFS, with its high performance and reliability, seems to be a strong
candidate.

But somebody told me that the response time increases linearly
with the number of datanodes.
I don't have enough hardware to test this for now.

I would very much appreciate it if anyone could provide information on this.

Thanks!!

--
Kevin Kuei

-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.


Re: response time increase??

Posted by Todd Lipcon <to...@cloudera.com>.
Well, technically, 1st_arrival_time is O(number of network hops between
racks) which is O(#racks) and obviously #racks is a function of the number
of datanodes. But you're talking an extra 0.1ms or so there per network hop,
and there isn't any system which could avoid it.

But HDFS-wise, the data structures are mostly hashtable-based, so lookups
don't get slower as you add datanodes.
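
To make the two points above concrete, here is a rough sketch in Python. It
is only an illustration, not HDFS code: the 0.1 ms per-hop figure is the
estimate quoted above, and the function names and the toy block index are
made up for this example.

```python
from typing import Dict

# Illustrative model only; nothing here is part of HDFS.
PER_HOP_MS = 0.1  # "0.1ms or so" per network hop, as estimated above


def network_delay_ms(hops: int, per_hop_ms: float = PER_HOP_MS) -> float:
    """Extra first-block delay contributed by network hops alone."""
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return hops * per_hop_ms


def build_block_index(num_datanodes: int) -> Dict[str, str]:
    """Toy hashtable mapping a block id to the datanode that holds it."""
    return {f"blk_{i}": f"datanode-{i}" for i in range(num_datanodes)}


if __name__ == "__main__":
    # The network term grows with the number of hops, not with cluster size.
    for hops in (2, 4, 6):
        print(f"{hops} hops: about {network_delay_ms(hops):.1f} ms extra")

    # A hashtable lookup takes expected constant time whether the index
    # holds ten entries or a hundred thousand.
    small = build_block_index(10)
    large = build_block_index(100_000)
    print(small["blk_7"], large["blk_7"])
```

The only term that grows here is the per-hop network delay, and it stays in
the sub-millisecond range for a handful of hops.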

-Todd

On Thu, Jan 6, 2011 at 2:01 AM, KevinKuei <kk...@above.net.tw> wrote:

>  Dear Todd,
>
> Thanks for your reply.
> Let me clarify my definition of "response time".
>
> *start_time: the time the request is sent
> *1st_arrival_time: the time the 1st block of data is received
> *completed_time: the time the final block of data is received
>
> response time = 1st_arrival_time - start_time
>
> Can you still confirm that there is NO "response_time = O(number of
> datanodes)" factor?
> If so, HDFS will be great for our application.  Thanks!!



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: response time increase??

Posted by KevinKuei <kk...@above.net.tw>.
Dear Todd,

Thanks for your reply.
Let me clarify my definition of "response time".

*start_time: the time the request is sent
*1st_arrival_time: the time the 1st block of data is received
*completed_time: the time the final block of data is received

response time = 1st_arrival_time - start_time

Can you still confirm that there is NO "response_time = O(number of
datanodes)" factor?
If so, HDFS will be great for our application.  Thanks!!
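
For anyone who wants to measure this, the definition above (time until the
1st block arrives, counted from when the request is sent) can be sketched in
Python roughly as follows. This is a minimal sketch under assumptions:
measure_read and open_stream are hypothetical names, and open_stream stands
for whatever client call returns the file's data blocks one by one. It is not
an HDFS API.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_read(
    open_stream: Callable[[], Iterable[bytes]],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Return (response_time, total_time) in the clock's units.

    response_time = 1st_arrival_time - start_time
    total_time    = completed_time   - start_time
    """
    start_time = clock()              # the request is sent
    first_arrival_time = None
    for _block in open_stream():      # hypothetical block-by-block reader
        if first_arrival_time is None:
            first_arrival_time = clock()  # 1st block of data received
    completed_time = clock()          # final block of data received
    if first_arrival_time is None:
        raise ValueError("stream produced no data")
    return first_arrival_time - start_time, completed_time - start_time


if __name__ == "__main__":
    # Stand-in data source: three small in-memory "blocks".
    fake_blocks = lambda: iter([b"a" * 1024, b"b" * 1024, b"c" * 1024])
    response, total = measure_read(fake_blocks)
    print(f"response time: {response:.6f}s, total: {total:.6f}s")
```

Running the same measurement on clusters of different sizes would show
whether the first value changes with the number of datanodes.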


--
Kevin Kuei


On 2011/1/6 at 4:23 PM, Todd Lipcon wrote:
> Hi Kevin,
>
> No, there is no O(number of datanodes) factor in performance.
>
> -Todd




Re: response time increase??

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Kevin,

No, there is no O(number of datanodes) factor in performance.

-Todd

On Wed, Jan 5, 2011 at 8:04 PM, KevinKuei <kk...@above.net.tw> wrote:

> Hi,
>
> I'm planning a YouTube-like online video site and am looking for a suitable
> file system.
> HDFS, with its high performance and reliability, seems to be a strong
> candidate.
>
> But somebody told me that the response time increases linearly with
> the number of datanodes.
> I don't have enough hardware to test this for now.
>
> I would very much appreciate it if anyone could provide information on this.
>
> Thanks!!
>
> --
> Kevin Kuei


-- 
Todd Lipcon
Software Engineer, Cloudera