Posted to mapreduce-user@hadoop.apache.org by Himanshu Jindal <ca...@gmail.com> on 2015/07/02 02:49:49 UTC

Scalability of Name Node for external clients

I have a question regarding the scalability of the NameNode. Typically, the
NameNode handles two types of clients:
1. Internal clients (DataNodes that are part of the Hadoop cluster)
2. External clients (client nodes requesting block locations in order
to perform reads/writes on DataNodes)


I am not much concerned about the throughput of internal clients; I am more
worried about the throughput of external clients. What is the expected
throughput of NameNode operations for external clients, and how scalable is
it? To be more precise, please consider the following example:

Consider a typical NameNode server managing a cluster of 100 DataNodes.
Assuming the internal clients use the default block report and heartbeat
settings, I have the following questions regarding the scalability of the
NameNode:
1. How many simultaneous external client connections can the NameNode
support? (A hundred thousand?)
2. How many operations (get block locations) can it serve per second?
3. What are the different ways to increase the throughput for these external
clients?
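
For reference, the internal load assumed in this example can be estimated
with a rough sketch like the one below. It assumes the stock defaults of
dfs.heartbeat.interval = 3 seconds and dfs.blockreport.intervalMsec =
21600000 (6 hours); neither value is stated in this post, and the function
name is only illustrative. It counts heartbeats and full block reports only,
not incremental block reports or other DataNode RPCs.

```python
# Back-of-envelope estimate of internal (DataNode -> NameNode) RPC load.
# Assumptions (not stated in the post): default dfs.heartbeat.interval of 3 s
# and default dfs.blockreport.intervalMsec of 21600000 ms (6 h).

def internal_rpc_rate(datanodes, heartbeat_interval_s=3.0,
                      blockreport_interval_s=6 * 3600.0):
    """Return (heartbeats per second, full block reports per second)."""
    heartbeats = datanodes / heartbeat_interval_s
    reports = datanodes / blockreport_interval_s
    return heartbeats, reports

hb, br = internal_rpc_rate(100)
print(f"heartbeats/s: {hb:.1f}, block reports/s: {br:.4f}")
```

Under these assumptions, 100 DataNodes generate only a few dozen heartbeats
per second, so the interesting part of the question is how much capacity is
left for external client requests.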

Thanks
Himanshu
