Posted to hdfs-user@hadoop.apache.org by Laurens Bronwasser <La...@imc.nl> on 2014/09/01 09:55:50 UTC

Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication factor.
Another cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?

[cid:1CB8F7F8-8572-407C-B14C-AF3E8B665E8C]


Thanks,
Laurens

________________________________

The information in this e-mail is intended only for the person or entity to which it is addressed.

It may contain confidential and /or privileged material. If someone other than the intended recipient should receive this e-mail, he / she shall not be entitled to read, disseminate, disclose or duplicate it.

If you receive this e-mail unintentionally, please inform us immediately by "reply" and then delete it from your system. Although this information has been compiled with great care, neither IMC Financial Markets & Asset Management nor any of its related entities shall accept any responsibility for any errors, omissions or other inaccuracies in this information or for the consequences thereof, nor shall it be bound in any way by the contents of this e-mail or its attachments. In the event of incomplete or incorrect transmission, please return the e-mail to the sender and permanently delete this message and any attachments.

Messages and attachments are scanned for all known viruses. Always scan attachments before opening them.

RE: Replication factor affecting write performance

Posted by Vimal Jain <vk...@gmail.com>.
Mike,
Please send an email to user-unsubscribe@hbase.apache.org.
Don't spam the entire mailing list.

Unsubscribe





*From:* Stanley Shi [mailto:sshi@pivotal.io]
*Sent:* Monday, September 01, 2014 7:31 PM
*To:* user@hadoop.apache.org
*Cc:* Julien Lehuen; Tyler McDougall
*Subject:* Re: Replication factor affecting write performance



What's the network setup and topology?

Also, the size of the cluster?



On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <
Laurens.Bronwasser@imc.nl> wrote:

And now with the right label on the Y-axis.





*From: *Microsoft Office User <la...@imc.nl>
*Date: *Monday, September 1, 2014 at 9:56 AM
*To: *"user@hadoop.apache.org" <us...@hadoop.apache.org>
*Cc: *Julien Lehuen <ju...@imc.nl>, Tyler McDougall <
Tyler.McDougall@imc.nl>
*Subject: *Replication factor affecting write performance



Hi,

We have a setup with two clusters.

One cluster shows a very strong degradation when we increase the replication
factor.

Another cluster shows hardly any degradation with increased replication
factor.



Any idea how to find out the bottleneck in the slower cluster?











-- 

Regards,

*Stanley Shi,*

RE: Replication factor affecting write performance

Posted by mike Zarrin <mi...@unitedrmr.com>.
Unsubscribe

 

 

From: Stanley Shi [mailto:sshi@pivotal.io] 
Sent: Monday, September 01, 2014 7:31 PM
To: user@hadoop.apache.org
Cc: Julien Lehuen; Tyler McDougall
Subject: Re: Replication factor affecting write performance

 

What's the network setup and topology?

Also, the size of the cluster?

 

On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <La...@imc.nl> wrote:

And now with the right label on the Y-axis.

 



 

From: Microsoft Office User <la...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Replication factor affecting write performance

 

Hi,

We have a setup with two clusters. 

One cluster shows a very strong degradation when we increase the replication factor.

Another cluster shows hardly any degradation with increased replication factor.

 

Any idea how to find out the bottleneck in the slower cluster?

 

 

 






 

-- 

Regards,

Stanley Shi,



RE: Replication factor affecting write performance

Posted by Laurens Bronwasser <La...@imc.nl>.
We found that the Java PermGen space being full caused the datanode process to be slow. With a high replication factor, the delay seems to build up with the number of datanodes.
Increasing the PermGen size improved performance significantly and made Impala a lot faster as well.
Are there any other Java settings that someone recommends looking at to improve performance?

Cheers,
Laurens
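
For anyone wanting to check for the same condition, here is a minimal sketch that reads the permanent generation ("PermSpace" above) utilisation from the text printed by `jstat -gc <datanode pid>`. It assumes a Java 7 or earlier HotSpot JVM, where that output has `PC` (capacity) and `PU` (used) columns; the thread does not state the Java version. The function name `perm_usage` and the sample numbers are made up for illustration and are not from our clusters.

```python
def perm_usage(jstat_gc_output):
    """Return PermGen utilisation (0.0 to 1.0) from `jstat -gc <pid>` text.

    Assumes the Java 7 (or earlier) column layout, which includes
    PC (perm capacity, KB) and PU (perm used, KB).
    """
    # Drop empty lines; the first remaining line is the header,
    # the last one is the most recent sample.
    lines = [l for l in jstat_gc_output.strip().splitlines() if l.strip()]
    header = lines[0].split()
    values = lines[-1].split()
    row = dict(zip(header, values))
    return float(row["PU"]) / float(row["PC"])


# Hypothetical sample in the Java 7 `jstat -gc` layout (values invented).
sample = """
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT
1024.0 1024.0  0.0   512.0  8192.0   4096.0   20480.0    10240.0  65536.0 65000.0     10    0.100   2      0.200    0.300
"""

print(f"PermGen used: {perm_usage(sample):.1%}")
```

If utilisation stays close to 100%, raising the limit with the HotSpot flag `-XX:MaxPermSize` (for example via `HADOOP_DATANODE_OPTS` in `hadoop-env.sh`) is the usual knob on those JVM versions; the right value depends on the installation.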



-----Original Message-----
From: Laurens Bronwasser
Sent: Tuesday, September 02, 2014 03:47 PM W. Europe Standard Time
To: user@hadoop.apache.org
Cc: Julien Lehuen; Tyler McDougall
Subject: Re: Replication factor affecting write performance

Thanks Stanley for responding.

Cluster B has 12 data nodes. Cluster A has 18 (slower, older) data nodes.
In each cluster, all data nodes and name nodes are under one switch. The clusters do not share a switch, though; they are in different data centers.

Perhaps this helps: below is a trace of the combined logs of all data nodes after writing one block of 500MB with replication factor 12 on Cluster B.
Note that the logs are combined and sorted with millisecond precision, so certain lines may have happened in a different order:

2014-09-02 15:18:41,135 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.87:55341 dest: /10.141.0.159:50010 hinterlands.trading.imc.intra
2014-09-02 15:18:41,137 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.159:46759 dest: /10.141.0.168:50010 redridge.trading.imc.intra
2014-09-02 15:18:41,138 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.168:37742 dest: /10.141.0.157:50010 eversong.trading.imc.intra
2014-09-02 15:18:41,139 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.157:36874 dest: /10.141.0.156:50010 elwynn.trading.imc.intra
2014-09-02 15:18:41,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.156:43356 dest: /10.141.0.170:50010 searing.trading.imc.intra
2014-09-02 15:18:41,141 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.170:35201 dest: /10.141.0.155:50010 duskwood.trading.imc.intra
2014-09-02 15:18:41,142 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.155:52119 dest: /10.141.0.174:50010 tirisfal.trading.imc.intra
2014-09-02 15:18:41,143 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.174:45775 dest: /10.141.0.173:50010 sorrows.trading.imc.intra
2014-09-02 15:18:41,144 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.173:55624 dest: /10.141.0.171:50010 shimmering.trading.imc.intra
2014-09-02 15:18:41,145 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.171:45532 dest: /10.141.0.172:50010 silverpine.trading.imc.intra
2014-09-02 15:18:41,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.172:35460 dest: /10.141.0.158:50010 gilneas.trading.imc.intra
2014-09-02 15:18:41,147 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.158:59490 dest: /10.141.0.169:50010 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357265647 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.172:35460, dest: /10.141.0.158:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 66b3bcb2-36a8-4684-878f-2b90cec9d558, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357606668 gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=LAST_IN_PIPELINE, downstreams=0:[] terminating scarlet.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.171:45532, dest: /10.141.0.172:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3d587891-c434-4a45-acb1-e41122ab71f9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357881405 silverpine.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.173:55624, dest: /10.141.0.171:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3b0fcf75-78d9-4bcd-b5e0-941a8dd40da8, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358283146 shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358543752 sorrows.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358704352 tirisfal.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359566398 searing.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 08b14350-3484-4a0e-8520-8167733c62bd, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359128561 duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ddcdea62-71e0-4279-a96c-aeea90580249, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359733153 elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: c953d2ac-57ba-47a4-ad49-24161a18770b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360356635 redridge.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360119140 eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30328724294 hinterlands.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra
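
To compare hops in a trace like the one above, a small script can turn each clienttrace line into a per-hop throughput figure. This is only a sketch: it treats the `duration` field as nanoseconds (which fits the roughly 30-second gap between the "Receiving" and clienttrace timestamps), and the names `CLIENTTRACE` and `hop_throughput` are invented for this example. The two sample lines are copied from the trace.

```python
import re

# Fields of interest in a DataNode clienttrace line for an HDFS write.
CLIENTTRACE = re.compile(
    r"src: /(?P<src>[\d.]+):\d+, dest: /(?P<dest>[\d.]+):\d+, "
    r"bytes: (?P<bytes>\d+), op: HDFS_WRITE,.*duration: (?P<duration>\d+)"
)


def hop_throughput(lines):
    """Return a list of (src, dest, seconds, MB/s), one per clienttrace line."""
    hops = []
    for line in lines:
        m = CLIENTTRACE.search(line)
        if m is None:
            continue  # "Receiving" and "PacketResponder" lines are skipped
        seconds = int(m.group("duration")) / 1e9  # assumption: nanoseconds
        megabytes = int(m.group("bytes")) / 1e6
        hops.append((m.group("src"), m.group("dest"), seconds, megabytes / seconds))
    return hops


# First hop (client to first data node) and last hop of the pipeline above.
sample = [
    "2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30328724294 hinterlands.trading.imc.intra",
    "2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357265647 scarlet.trading.imc.intra",
]

for src, dest, seconds, rate in hop_throughput(sample):
    print(f"{src} -> {dest}: {seconds:.2f} s, {rate:.1f} MB/s")
```

Because the block is streamed through the pipeline, every hop reports nearly the same duration, so a single slow node would show up as a uniformly low rate rather than as one outlier; comparing the same figure across replication factors and across both clusters is probably more telling.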




From: Stanley Shi <ss...@pivotal.io>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Tuesday, September 2, 2014 at 4:30 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Re: Replication factor affecting write performance

What's the network setup and topology?
Also, the size of the cluster?


On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <La...@imc.nl> wrote:
And now with the right label on the Y-axis.

[cid:7D64F387-C6F1-4B37-9894-0166EC949EF9]

From: Microsoft Office User <la...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication factor.
Another cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?






--
Regards,
Stanley Shi,


2014-09-02 15:18:41,147 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.158:59490 dest: /10.141.0.169:50010 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357265647 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.172:35460, dest: /10.141.0.158:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 66b3bcb2-36a8-4684-878f-2b90cec9d558, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357606668 gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=LAST_IN_PIPELINE, downstreams=0:[] terminating scarlet.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.171:45532, dest: /10.141.0.172:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3d587891-c434-4a45-acb1-e41122ab71f9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357881405 silverpine.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.173:55624, dest: /10.141.0.171:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3b0fcf75-78d9-4bcd-b5e0-941a8dd40da8, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358283146 shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358543752 sorrows.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358704352 tirisfal.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359566398 searing.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 08b14350-3484-4a0e-8520-8167733c62bd, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359128561 duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ddcdea62-71e0-4279-a96c-aeea90580249, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359733153 elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: c953d2ac-57ba-47a4-ad49-24161a18770b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360356635 redridge.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360119140 eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30328724294 hinterlands.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra
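
Each clienttrace line in the trace above reports the bytes received and a duration for one hop of the write pipeline. The following is a minimal Python sketch that turns such lines into a per-hop throughput. It assumes the duration field is in nanoseconds, which matches the roughly 30 seconds between the Receiving and clienttrace timestamps. The regular expression and the name hop_stats are illustrative and not part of Hadoop.

```python
import re

# Fields of a DataNode clienttrace line needed for the calculation.
CLIENTTRACE = re.compile(
    r"src: /(?P<src>[\d.]+):\d+, dest: /(?P<dest>[\d.]+):\d+, "
    r"bytes: (?P<bytes>\d+), op: HDFS_WRITE,.*duration: (?P<duration>\d+)"
)


def hop_stats(lines):
    """Return (src, dest, seconds, MB/s) for each HDFS_WRITE clienttrace line."""
    stats = []
    for line in lines:
        match = CLIENTTRACE.search(line)
        if match is None:
            continue  # e.g. "Receiving" or "PacketResponder" lines
        seconds = int(match.group("duration")) / 1e9  # assumed nanoseconds
        mb_per_s = int(match.group("bytes")) / 1e6 / seconds
        stats.append((match.group("src"), match.group("dest"), seconds, mb_per_s))
    return stats


# Two lines copied from the trace: the last hop and the first hop.
sample = [
    "2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode."
    "DataNode.clienttrace: src: /10.141.0.158:59490, dest: /10.141.0.169:50010, "
    "bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, "
    "offset: 0, srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: "
    "BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, "
    "duration: 30357265647 scarlet.trading.imc.intra",
    "2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode."
    "DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, "
    "bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, "
    "offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: "
    "BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, "
    "duration: 30328724294 hinterlands.trading.imc.intra",
]

for src, dest, seconds, rate in hop_stats(sample):
    print(f"{src} -> {dest}: {seconds:.2f} s, {rate:.1f} MB/s")
```

For the lines in this trace that works out to roughly 17.7 MB/s per hop. All twelve hops report durations within about 32 ms of each other, as one would expect for a streaming pipeline, so the per-hop durations alone do not single out one slow node.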




From: Stanley Shi <ss...@pivotal.io>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Tuesday, September 2, 2014 at 4:30 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Re: Replication factor affecting write performance

What's the network setup and topology?
Also, the size of the cluster?


On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <La...@imc.nl> wrote:
And now with the right label on the Y-axis.

[Inline image attachment: chart, not preserved in this plain-text archive]

From: Microsoft Office User <la...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication factor.
Another cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?






--
Regards,
Stanley Shi,

2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358543752 sorrows.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358704352 tirisfal.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359566398 searing.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 08b14350-3484-4a0e-8520-8167733c62bd, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359128561 duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ddcdea62-71e0-4279-a96c-aeea90580249, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359733153 elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: c953d2ac-57ba-47a4-ad49-24161a18770b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360356635 redridge.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360119140 eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30328724294 hinterlands.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra
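
One way to read the clienttrace lines above is to turn each hop's byte count and duration into a throughput figure. The sketch below is only an illustration, not part of the original test: the helper name hop_throughputs is hypothetical, and it assumes that "duration" is in nanoseconds, which fits the roughly 30 seconds between the "Receiving" and clienttrace timestamps. It expects unwrapped log lines like the ones in this message.

```python
import re

# Pulls src, dest, bytes and duration out of an HDFS_WRITE clienttrace line.
CLIENTTRACE = re.compile(
    r"src: /(?P<src>[\d.]+):\d+, dest: /(?P<dest>[\d.]+):\d+, "
    r"bytes: (?P<bytes>\d+), op: HDFS_WRITE,.*duration: (?P<duration>\d+)"
)


def hop_throughputs(lines):
    """Return (src, dest, MB/s) for every HDFS_WRITE clienttrace line."""
    hops = []
    for line in lines:
        m = CLIENTTRACE.search(line)
        if m is None:
            continue  # not a clienttrace line (e.g. Receiving / PacketResponder)
        seconds = int(m["duration"]) / 1e9  # assumption: duration is nanoseconds
        megabytes = int(m["bytes"]) / 1e6   # decimal megabytes
        hops.append((m["src"], m["dest"], megabytes / seconds))
    return hops


# Two lines copied from the trace: the last hop and the client's first hop.
sample = [
    "2014-09-02 15:19:11,506 INFO "
    "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
    "src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, "
    "op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, "
    "srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: "
    "BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, "
    "duration: 30357265647 scarlet.trading.imc.intra",
    "2014-09-02 15:19:11,511 INFO "
    "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
    "src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, "
    "op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, "
    "srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: "
    "BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, "
    "duration: 30328724294 hinterlands.trading.imc.intra",
]

for src, dest, rate in hop_throughputs(sample):
    print(f"{src} -> {dest}: {rate:.2f} MB/s")
```

Under that nanosecond assumption, both hops come out at about 17.7 MB/s, and every other hop in the trace has nearly the same duration, so the whole 12-node pipeline appears to move at the speed of its slowest stage rather than any single hop standing out.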




From: Stanley Shi <ss...@pivotal.io>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Tuesday, September 2, 2014 at 4:30 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Re: Replication factor affecting write performance

What's the network setup and topology?
Also, the size of the cluster?


On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <La...@imc.nl> wrote:
And now with the right label on the Y-axis.

[cid:7D64F387-C6F1-4B37-9894-0166EC949EF9]

From: Microsoft Office User <la...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication factor.
Another cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?






--
Regards,
Stanley Shi,

Re: Replication factor affecting write performance

Posted by Dieter De Witte <dr...@gmail.com>.
Isn't it logical that performance drops as your replication factor
approaches the number of data nodes? If the replication factor approaches
18 on cluster A, write performance will probably drop there too. The more
nodes HDFS can choose from when placing a block, the likelier the write is
to succeed and run efficiently (without delays). If every write needs all
of your nodes and you do many writes, the writing process will slow down,
since a second write may have to wait for earlier writes to finish.
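
To put a rough number on that intuition, here is a small sketch. It rests on a simplifying assumption that is not from the thread: the pipeline nodes are drawn uniformly at random from the data nodes, which ignores HDFS rack awareness and local-writer placement. The function name p_pipeline_avoids_slow is hypothetical. It computes the chance that a write pipeline avoids every "slow" (busy or degraded) node.

```python
from math import comb


def p_pipeline_avoids_slow(n_nodes, n_slow, replication):
    """Probability that `replication` distinct nodes, picked uniformly at
    random out of `n_nodes`, include none of the `n_slow` slow nodes."""
    if replication > n_nodes:
        raise ValueError("replication cannot exceed the number of data nodes")
    # Ways to pick the pipeline from the fast nodes only, over all ways.
    return comb(n_nodes - n_slow, replication) / comb(n_nodes, replication)


# Cluster sizes from the thread: 12 and 18 data nodes, one slow node assumed.
for nodes in (12, 18):
    for r in (3, 6, 12):
        p = p_pipeline_avoids_slow(nodes, 1, r)
        print(f"{nodes} nodes, replication {r}: {p:.2f}")
```

With a single slow node this reduces to (n - r) / n, so once the replication factor equals the number of data nodes, every write has to go through the slowest node in the cluster.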


2014-09-02 15:47 GMT+02:00 Laurens Bronwasser <La...@imc.nl>:

>  Thanks Stanley for responding.
>
> Cluster B has 12 data nodes. Cluster A has 18 (slower, older) data nodes.
> In each cluster, all data nodes and name nodes sit under a single switch.
> The clusters do not share a switch, though; they are in different data
> centers.
>
> Perhaps this helps: below is a trace of the combined logs of all data
> nodes after writing one 500 MB block with replication factor 12 on
> Cluster B.
> Note that the logs are combined and sorted at millisecond precision, so
> certain lines may have occurred in a different order than shown:
>
>  2014-09-02 15:18:41,135 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.87:55341 dest: /10.141.0.159:50010 hinterlands.trading.imc.intra
> 2014-09-02 15:18:41,137 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.159:46759 dest: /10.141.0.168:50010 redridge.trading.imc.intra
> 2014-09-02 15:18:41,138 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.168:37742 dest: /10.141.0.157:50010 eversong.trading.imc.intra
> 2014-09-02 15:18:41,139 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.157:36874 dest: /10.141.0.156:50010 elwynn.trading.imc.intra
> 2014-09-02 15:18:41,140 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.156:43356 dest: /10.141.0.170:50010 searing.trading.imc.intra
> 2014-09-02 15:18:41,141 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.170:35201 dest: /10.141.0.155:50010 duskwood.trading.imc.intra
> 2014-09-02 15:18:41,142 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.155:52119 dest: /10.141.0.174:50010 tirisfal.trading.imc.intra
> 2014-09-02 15:18:41,143 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.174:45775 dest: /10.141.0.173:50010 sorrows.trading.imc.intra
> 2014-09-02 15:18:41,144 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.173:55624 dest: /10.141.0.171:50010 shimmering.trading.imc.intra
> 2014-09-02 15:18:41,145 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.171:45532 dest: /10.141.0.172:50010 silverpine.trading.imc.intra
> 2014-09-02 15:18:41,146 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.172:35460 dest: /10.141.0.158:50010 gilneas.trading.imc.intra
> 2014-09-02 15:18:41,147 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.158:59490 dest: /10.141.0.169:50010 scarlet.trading.imc.intra
> 2014-09-02 15:19:11,506 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30357265647 scarlet.trading.imc.intra
> 2014-09-02 15:19:11,506 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.172:35460, dest: /10.141.0.158:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 66b3bcb2-36a8-4684-878f-2b90cec9d558, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30357606668 gilneas.trading.imc.intra
> 2014-09-02 15:19:11,506 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating gilneas.trading.imc.intra
> 2014-09-02 15:19:11,506 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=LAST_IN_PIPELINE, downstreams=0:[] terminating
> scarlet.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.171:45532, dest: /10.141.0.172:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 3d587891-c434-4a45-acb1-e41122ab71f9, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30357881405 silverpine.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.173:55624, dest: /10.141.0.171:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 3b0fcf75-78d9-4bcd-b5e0-941a8dd40da8, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30358283146 shimmering.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30358543752 sorrows.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
> 2014-09-02 15:19:11,508 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30358704352 tirisfal.trading.imc.intra
> 2014-09-02 15:19:11,508 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
> 2014-09-02 15:19:11,508 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30359566398 searing.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 08b14350-3484-4a0e-8520-8167733c62bd, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30359128561 duskwood.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> ddcdea62-71e0-4279-a96c-aeea90580249, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30359733153 elwynn.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> c953d2ac-57ba-47a4-ad49-24161a18770b, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30360356635 redridge.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30360119140 eversong.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
> 2014-09-02 15:19:11,511 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30328724294 hinterlands.trading.imc.intra
> 2014-09-02 15:19:11,511 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra
>
>
>
>
>   From: Stanley Shi <ss...@pivotal.io>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Tuesday, September 2, 2014 at 4:30 AM
>
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <
> Tyler.McDougall@imc.nl>
> Subject: Re: Replication factor affecting write performance
>
>   What's the network setup and topology?
> Also, the size of the cluster?
>
>
> On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <
> Laurens.Bronwasser@imc.nl> wrote:
>
>>  And now with the right label on the Y-axis.
>>
>>
>>   From: Microsoft Office User <la...@imc.nl>
>> Date: Monday, September 1, 2014 at 9:56 AM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <
>> Tyler.McDougall@imc.nl>
>> Subject: Replication factor affecting write performance
>>
>>   Hi,
>> We have a setup with two clusters.
>> One cluster shows a very strong degradation when we increase the
>> replication factor.
>> Another cluster shows hardly any degradation with increased replication
>> factor.
>>
>>  Any idea how to find out the bottleneck in the slower cluster?
>>
>>
>>
>>
>
>
>
>  --
>  Regards,
> *Stanley Shi,*
>
>
>

> 30358704352 tirisfal.trading.imc.intra
> 2014-09-02 15:19:11,508 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
> 2014-09-02 15:19:11,508 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30359566398 searing.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 08b14350-3484-4a0e-8520-8167733c62bd, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30359128561 duskwood.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> ddcdea62-71e0-4279-a96c-aeea90580249, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30359733153 elwynn.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> c953d2ac-57ba-47a4-ad49-24161a18770b, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30360356635 redridge.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30360119140 eversong.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
> 2014-09-02 15:19:11,511 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30328724294 hinterlands.trading.imc.intra
> 2014-09-02 15:19:11,511 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra
>
>
>
>
>   From: Stanley Shi <ss...@pivotal.io>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Tuesday, September 2, 2014 at 4:30 AM
>
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <
> Tyler.McDougall@imc.nl>
> Subject: Re: Replication factor affecting write performance
>
>   What's the network setup and topology?
> Also, the size of the cluster?
>
>
> On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <
> Laurens.Bronwasser@imc.nl> wrote:
>
>>  And now with the right label on the Y-axis.
>>
>>
>>   From: Microsoft Office User <la...@imc.nl>
>> Date: Monday, September 1, 2014 at 9:56 AM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <
>> Tyler.McDougall@imc.nl>
>> Subject: Replication factor affecting write performance
>>
>>   Hi,
>> We have a setup with two clusters.
>> One cluster shows a very strong degradation when we increase the
>> replication factor.
>> The other cluster shows hardly any degradation with increased replication
>> factor.
>>
>>  Any idea how to find out the bottleneck in the slower cluster?
>>
>>
>>
>>
>
>
>
>  --
>  Regards,
> *Stanley Shi,*
>
>
>
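
The clienttrace lines in the trace above carry both a byte count and a duration, so the effective write rate per pipeline hop can be estimated from them. The sketch below is only an illustration: it assumes the mail line-wrapping and quote markers have already been removed, and that `duration` is in nanoseconds (which agrees with the roughly 30 seconds between the "Receiving" and clienttrace timestamps). The names `SAMPLE`, `TRACE` and `throughput_mb_per_s` are made up for this example.

```python
import re

# Two clienttrace entries copied from the trace, unwrapped by hand.
# The trailing hostname comes from the poster's log merging.
SAMPLE = [
    "2014-09-02 15:19:11,506 INFO "
    "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
    "src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, "
    "op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, "
    "srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: "
    "BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, "
    "duration: 30357265647 scarlet.trading.imc.intra",
    "2014-09-02 15:19:11,511 INFO "
    "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
    "src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, "
    "op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, "
    "srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: "
    "BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, "
    "duration: 30328724294 hinterlands.trading.imc.intra",
]

# Pull the receiving address, the byte count and the duration out of one line.
TRACE = re.compile(
    r"dest: /(?P<dest>[\d.]+):\d+, bytes: (?P<bytes>\d+), .*duration: (?P<ns>\d+)"
)


def throughput_mb_per_s(line):
    """Return (receiving address, MB/s) for a clienttrace line, else None."""
    match = TRACE.search(line)
    if match is None:
        return None
    seconds = int(match.group("ns")) / 1e9  # assumption: duration is in ns
    megabytes = int(match.group("bytes")) / 1e6
    return match.group("dest"), megabytes / seconds


if __name__ == "__main__":
    for entry in SAMPLE:
        dest, rate = throughput_mb_per_s(entry)
        print(f"{dest}: {rate:.1f} MB/s")
```

Both the first and the last hop of the pipeline come out at roughly the same rate, which is what one would expect if the whole pipeline moves at the speed of its slowest link. Running the same calculation on the other cluster's logs would show whether one hop there is noticeably slower than the rest.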

Re: Replication factor affecting write performance

Posted by Dieter De Witte <dr...@gmail.com>.
Isn't it logical that your performance will drop when your replication
factor approaches the number of data nodes? If the replication factor
approaches 18 on cluster A, the write performance will probably drop there
too. The more nodes HDFS can choose from when writing the data, the likelier
the write is to be successful and efficient (no delays). If every write
needs all of your nodes and you do many writes, the writing process will
slow down, since one write might have to wait for earlier writes to finish.
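
To make this concrete for the trace quoted below: the "Receiving" lines can be chained into the write pipeline, which shows how many data nodes a single block touches. This is only a sketch. The address pairs are copied from the quoted log, the first source (10.141.0.87) is assumed to be the client, and `RECEIVING` and `build_pipeline` are names made up for the example.

```python
import re

# src/dest pairs copied from the "Receiving" lines of the quoted trace.
RECEIVING = [
    "src: /10.141.0.87:55341 dest: /10.141.0.159:50010",
    "src: /10.141.0.159:46759 dest: /10.141.0.168:50010",
    "src: /10.141.0.168:37742 dest: /10.141.0.157:50010",
    "src: /10.141.0.157:36874 dest: /10.141.0.156:50010",
    "src: /10.141.0.156:43356 dest: /10.141.0.170:50010",
    "src: /10.141.0.170:35201 dest: /10.141.0.155:50010",
    "src: /10.141.0.155:52119 dest: /10.141.0.174:50010",
    "src: /10.141.0.174:45775 dest: /10.141.0.173:50010",
    "src: /10.141.0.173:55624 dest: /10.141.0.171:50010",
    "src: /10.141.0.171:45532 dest: /10.141.0.172:50010",
    "src: /10.141.0.172:35460 dest: /10.141.0.158:50010",
    "src: /10.141.0.158:59490 dest: /10.141.0.169:50010",
]

HOP = re.compile(r"src: /([\d.]+):\d+ dest: /([\d.]+):\d+")


def build_pipeline(lines):
    """Chain src -> dest hops into one ordered list of addresses.

    The result does not depend on the order of the input lines, which
    matters because the merged log is only sorted to the millisecond.
    """
    next_hop = dict(HOP.search(line).groups() for line in lines)
    # The head of the chain is the only source that never appears as a dest.
    (head,) = set(next_hop) - set(next_hop.values())
    pipeline = [head]
    while pipeline[-1] in next_hop:
        pipeline.append(next_hop[pipeline[-1]])
    return pipeline


if __name__ == "__main__":
    chain = build_pipeline(RECEIVING)
    print(" -> ".join(chain))
    print("data nodes in pipeline:", len(chain) - 1)
```

For this block the chain contains 12 distinct data nodes after the client, which is every data node of cluster B. With replication factor 12 there, every write occupies the entire cluster.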


2014-09-02 15:47 GMT+02:00 Laurens Bronwasser <La...@imc.nl>:

>  Thanks Stanley for responding.
>
>  Cluster B has 12 data nodes. Cluster A has 18 (slower,
> older) data nodes.
> The topology is that all data nodes and name nodes are under one switch.
> The clusters do not share a switch, though; they are in different
> data centers.
>
>  Perhaps this helps: below is a trace of the combined logs of all data
> nodes, after writing one block of 500MB with replication factor 12 on
> Cluster B.
> Note that the logs are combined and sorted at millisecond precision, so
> certain lines may have happened in a different order:
>
>  2014-09-02 15:18:41,135 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.87:55341 dest: /10.141.0.159:50010 hinterlands.trading.imc.intra
> 2014-09-02 15:18:41,137 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.159:46759 dest: /10.141.0.168:50010 redridge.trading.imc.intra
> 2014-09-02 15:18:41,138 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.168:37742 dest: /10.141.0.157:50010 eversong.trading.imc.intra
> 2014-09-02 15:18:41,139 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.157:36874 dest: /10.141.0.156:50010 elwynn.trading.imc.intra
> 2014-09-02 15:18:41,140 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.156:43356 dest: /10.141.0.170:50010 searing.trading.imc.intra
> 2014-09-02 15:18:41,141 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.170:35201 dest: /10.141.0.155:50010 duskwood.trading.imc.intra
> 2014-09-02 15:18:41,142 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.155:52119 dest: /10.141.0.174:50010 tirisfal.trading.imc.intra
> 2014-09-02 15:18:41,143 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.174:45775 dest: /10.141.0.173:50010 sorrows.trading.imc.intra
> 2014-09-02 15:18:41,144 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.173:55624 dest: /10.141.0.171:50010 shimmering.trading.imc.intra
> 2014-09-02 15:18:41,145 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.171:45532 dest: /10.141.0.172:50010 silverpine.trading.imc.intra
> 2014-09-02 15:18:41,146 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.172:35460 dest: /10.141.0.158:50010 gilneas.trading.imc.intra
> 2014-09-02 15:18:41,147 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /
> 10.141.0.158:59490 dest: /10.141.0.169:50010 scarlet.trading.imc.intra
> 2014-09-02 15:19:11,506 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30357265647 scarlet.trading.imc.intra
> 2014-09-02 15:19:11,506 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.172:35460, dest: /10.141.0.158:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 66b3bcb2-36a8-4684-878f-2b90cec9d558, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30357606668 gilneas.trading.imc.intra
> 2014-09-02 15:19:11,506 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating gilneas.trading.imc.intra
> 2014-09-02 15:19:11,506 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=LAST_IN_PIPELINE, downstreams=0:[] terminating
> scarlet.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.171:45532, dest: /10.141.0.172:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 3d587891-c434-4a45-acb1-e41122ab71f9, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30357881405 silverpine.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.173:55624, dest: /10.141.0.171:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 3b0fcf75-78d9-4bcd-b5e0-941a8dd40da8, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30358283146 shimmering.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30358543752 sorrows.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
> 2014-09-02 15:19:11,507 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
> 2014-09-02 15:19:11,508 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30358704352 tirisfal.trading.imc.intra
> 2014-09-02 15:19:11,508 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
> 2014-09-02 15:19:11,508 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30359566398 searing.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 08b14350-3484-4a0e-8520-8167733c62bd, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30359128561 duskwood.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
> 2014-09-02 15:19:11,509 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> ddcdea62-71e0-4279-a96c-aeea90580249, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30359733153 elwynn.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> c953d2ac-57ba-47a4-ad49-24161a18770b, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30360356635 redridge.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30360119140 eversong.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
> 2014-09-02 15:19:11,510 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
> 2014-09-02 15:19:11,511 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op:
> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID:
> 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration:
> 30328724294 hinterlands.trading.imc.intra
> 2014-09-02 15:19:11,511 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
> BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587,
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra
>
>
>
>
>   From: Stanley Shi <ss...@pivotal.io>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Tuesday, September 2, 2014 at 4:30 AM
>
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <
> Tyler.McDougall@imc.nl>
> Subject: Re: Replication factor affecting write performance
>
>   What's the network setup and topology?
> Also, the size of the cluster?
>
>
> On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <
> Laurens.Bronwasser@imc.nl> wrote:
>
>>  And now with the right label on the Y-axis.
>>
>>
>>   From: Microsoft Office User <la...@imc.nl>
>> Date: Monday, September 1, 2014 at 9:56 AM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <
>> Tyler.McDougall@imc.nl>
>> Subject: Replication factor affecting write performance
>>
>>   Hi,
>> We have a setup with two clusters.
>> One cluster shows a very strong degradation when we increase the
>> replication factor.
>> Another cluster shows hardly any degradation with increased replication
>> factor.
>>
>>  Any idea how to find out the bottleneck in the slower cluster?
>>
>>
>>
>
>
>
>  --
>  Regards,
> *Stanley Shi,*
>
>
>

Re: Replication factor affecting write performance

Posted by Laurens Bronwasser <La...@imc.nl>.
Thanks Stanley for responding.

Cluster B has 12 data nodes. Cluster A has 18 (slower, older) data nodes.
In each cluster, all data nodes and name nodes are under one switch. The clusters do not share a switch, though; they are in different data centers.

Perhaps this helps: below is a trace of the combined logs of all data nodes after writing one block of 500 MB with replication factor 12 on Cluster B.
Note that the logs are combined and sorted at millisecond precision, so certain lines may have happened in a different order:

2014-09-02 15:18:41,135 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.87:55341 dest: /10.141.0.159:50010 hinterlands.trading.imc.intra
2014-09-02 15:18:41,137 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.159:46759 dest: /10.141.0.168:50010 redridge.trading.imc.intra
2014-09-02 15:18:41,138 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.168:37742 dest: /10.141.0.157:50010 eversong.trading.imc.intra
2014-09-02 15:18:41,139 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.157:36874 dest: /10.141.0.156:50010 elwynn.trading.imc.intra
2014-09-02 15:18:41,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.156:43356 dest: /10.141.0.170:50010 searing.trading.imc.intra
2014-09-02 15:18:41,141 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.170:35201 dest: /10.141.0.155:50010 duskwood.trading.imc.intra
2014-09-02 15:18:41,142 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.155:52119 dest: /10.141.0.174:50010 tirisfal.trading.imc.intra
2014-09-02 15:18:41,143 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.174:45775 dest: /10.141.0.173:50010 sorrows.trading.imc.intra
2014-09-02 15:18:41,144 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.173:55624 dest: /10.141.0.171:50010 shimmering.trading.imc.intra
2014-09-02 15:18:41,145 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.171:45532 dest: /10.141.0.172:50010 silverpine.trading.imc.intra
2014-09-02 15:18:41,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.172:35460 dest: /10.141.0.158:50010 gilneas.trading.imc.intra
2014-09-02 15:18:41,147 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.158:59490 dest: /10.141.0.169:50010 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357265647 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.172:35460, dest: /10.141.0.158:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 66b3bcb2-36a8-4684-878f-2b90cec9d558, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357606668 gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=LAST_IN_PIPELINE, downstreams=0:[] terminating scarlet.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.171:45532, dest: /10.141.0.172:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3d587891-c434-4a45-acb1-e41122ab71f9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357881405 silverpine.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.173:55624, dest: /10.141.0.171:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3b0fcf75-78d9-4bcd-b5e0-941a8dd40da8, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358283146 shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358543752 sorrows.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358704352 tirisfal.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359566398 searing.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 08b14350-3484-4a0e-8520-8167733c62bd, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359128561 duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ddcdea62-71e0-4279-a96c-aeea90580249, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359733153 elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: c953d2ac-57ba-47a4-ad49-24161a18770b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360356635 redridge.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360119140 eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30328724294 hinterlands.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra
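
As a rough reading of the trace above: the duration field in the clienttrace lines looks like nanoseconds (30357265647 lines up with the roughly 30.4 seconds between the Receiving and clienttrace timestamps), so per-hop throughput can be estimated from bytes and duration. A minimal Python sketch under that assumption; the function name and regex are illustrative and not part of Hadoop:

```python
import re

# Matches the "bytes: N ... duration: N host" part of a DataNode clienttrace line.
# Assumptions: duration is in nanoseconds, and the trailing hostname is the
# label that was appended when the per-node logs were combined.
TRACE_RE = re.compile(r"bytes: (\d+),.*duration: (\d+) (\S+)")


def hop_throughput(line):
    """Return (host, seconds, MB/s) for one clienttrace line, or None."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    nbytes, nanos, host = int(m.group(1)), int(m.group(2)), m.group(3)
    seconds = nanos / 1e9
    return host, seconds, nbytes / seconds / 1e6


# One clienttrace line copied from the trace (last hop of the pipeline).
sample = (
    "2014-09-02 15:19:11,506 INFO "
    "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
    "src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, "
    "op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, "
    "srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: "
    "BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, "
    "duration: 30357265647 scarlet.trading.imc.intra"
)

host, seconds, mb_per_s = hop_throughput(sample)
print(f"{host}: {seconds:.2f} s, {mb_per_s:.2f} MB/s")
```

For the last hop (scarlet) this works out to roughly 30.4 s and about 17.7 MB/s. All twelve hops in the trace report nearly the same duration, which would be consistent with the whole pipeline moving at the pace of its slowest stage.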




From: Stanley Shi <ss...@pivotal.io>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Tuesday, September 2, 2014 at 4:30 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Re: Replication factor affecting write performance

What's the network setup and topology?
Also, the size of the cluster?


On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <La...@imc.nl> wrote:
And now with the right label on the Y-axis.

[cid:7D64F387-C6F1-4B37-9894-0166EC949EF9]

From: Microsoft Office User <la...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication factor.
Another cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?





--
Regards,
Stanley Shi,


Re: Replication factor affecting write performance

Posted by Laurens Bronwasser <La...@imc.nl>.
Thanks Stanley for responding.

Cluster B has 12 data nodes. Cluster A has 18 (slower, older) data nodes.
In each cluster, all data nodes and name nodes are under one switch. The clusters do not share a switch, though; they are in different data centers.

Perhaps this helps: below is a trace of the combined logs of all data nodes after writing one block of 500 MB with replication factor 12 on Cluster B.
Note that the logs are combined and sorted at millisecond precision, so certain lines may have happened in a different order:

2014-09-02 15:18:41,135 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.87:55341 dest: /10.141.0.159:50010 hinterlands.trading.imc.intra
2014-09-02 15:18:41,137 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.159:46759 dest: /10.141.0.168:50010 redridge.trading.imc.intra
2014-09-02 15:18:41,138 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.168:37742 dest: /10.141.0.157:50010 eversong.trading.imc.intra
2014-09-02 15:18:41,139 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.157:36874 dest: /10.141.0.156:50010 elwynn.trading.imc.intra
2014-09-02 15:18:41,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.156:43356 dest: /10.141.0.170:50010 searing.trading.imc.intra
2014-09-02 15:18:41,141 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.170:35201 dest: /10.141.0.155:50010 duskwood.trading.imc.intra
2014-09-02 15:18:41,142 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.155:52119 dest: /10.141.0.174:50010 tirisfal.trading.imc.intra
2014-09-02 15:18:41,143 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.174:45775 dest: /10.141.0.173:50010 sorrows.trading.imc.intra
2014-09-02 15:18:41,144 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.173:55624 dest: /10.141.0.171:50010 shimmering.trading.imc.intra
2014-09-02 15:18:41,145 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.171:45532 dest: /10.141.0.172:50010 silverpine.trading.imc.intra
2014-09-02 15:18:41,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.172:35460 dest: /10.141.0.158:50010 gilneas.trading.imc.intra
2014-09-02 15:18:41,147 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.158:59490 dest: /10.141.0.169:50010 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357265647 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.172:35460, dest: /10.141.0.158:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 66b3bcb2-36a8-4684-878f-2b90cec9d558, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357606668 gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=LAST_IN_PIPELINE, downstreams=0:[] terminating scarlet.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.171:45532, dest: /10.141.0.172:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3d587891-c434-4a45-acb1-e41122ab71f9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357881405 silverpine.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.173:55624, dest: /10.141.0.171:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3b0fcf75-78d9-4bcd-b5e0-941a8dd40da8, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358283146 shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358543752 sorrows.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358704352 tirisfal.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359566398 searing.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 08b14350-3484-4a0e-8520-8167733c62bd, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359128561 duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ddcdea62-71e0-4279-a96c-aeea90580249, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359733153 elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: c953d2ac-57ba-47a4-ad49-24161a18770b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360356635 redridge.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360119140 eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30328724294 hinterlands.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra




From: Stanley Shi <ss...@pivotal.io>
Reply-To: user@hadoop.apache.org
Date: Tuesday, September 2, 2014 at 4:30 AM
To: user@hadoop.apache.org
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Re: Replication factor affecting write performance

What's the network setup and topology?
Also, the size of the cluster?


On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <La...@imc.nl> wrote:
And now with the right label on the Y-axis.

[Inline image: chart with the corrected Y-axis label; not preserved in this plain-text archive]

From: Microsoft Office User <la...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: user@hadoop.apache.org
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication factor.
Another cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?






--
Regards,
Stanley Shi,

Re: Replication factor affecting write performance

Posted by Laurens Bronwasser <La...@imc.nl>.
Thanks Stanley for responding.

Cluster B has 12 data nodes. Cluster A has 18 (slower, older) data nodes.
In each cluster, all data nodes and name nodes are under one switch. The clusters do not share a switch, though; they are in different data centers.

Perhaps this helps: below is a trace of the combined logs of all data nodes after writing one block of 500 MB with replication factor 12 on Cluster B.
Note that the logs are combined and sorted at millisecond precision, so certain lines may have happened in a different order:

2014-09-02 15:18:41,135 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.87:55341 dest: /10.141.0.159:50010 hinterlands.trading.imc.intra
2014-09-02 15:18:41,137 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.159:46759 dest: /10.141.0.168:50010 redridge.trading.imc.intra
2014-09-02 15:18:41,138 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.168:37742 dest: /10.141.0.157:50010 eversong.trading.imc.intra
2014-09-02 15:18:41,139 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.157:36874 dest: /10.141.0.156:50010 elwynn.trading.imc.intra
2014-09-02 15:18:41,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.156:43356 dest: /10.141.0.170:50010 searing.trading.imc.intra
2014-09-02 15:18:41,141 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.170:35201 dest: /10.141.0.155:50010 duskwood.trading.imc.intra
2014-09-02 15:18:41,142 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.155:52119 dest: /10.141.0.174:50010 tirisfal.trading.imc.intra
2014-09-02 15:18:41,143 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.174:45775 dest: /10.141.0.173:50010 sorrows.trading.imc.intra
2014-09-02 15:18:41,144 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.173:55624 dest: /10.141.0.171:50010 shimmering.trading.imc.intra
2014-09-02 15:18:41,145 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.171:45532 dest: /10.141.0.172:50010 silverpine.trading.imc.intra
2014-09-02 15:18:41,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.172:35460 dest: /10.141.0.158:50010 gilneas.trading.imc.intra
2014-09-02 15:18:41,147 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.158:59490 dest: /10.141.0.169:50010 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357265647 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.172:35460, dest: /10.141.0.158:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 66b3bcb2-36a8-4684-878f-2b90cec9d558, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357606668 gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=LAST_IN_PIPELINE, downstreams=0:[] terminating scarlet.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.171:45532, dest: /10.141.0.172:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3d587891-c434-4a45-acb1-e41122ab71f9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357881405 silverpine.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.173:55624, dest: /10.141.0.171:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3b0fcf75-78d9-4bcd-b5e0-941a8dd40da8, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358283146 shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358543752 sorrows.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358704352 tirisfal.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359566398 searing.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 08b14350-3484-4a0e-8520-8167733c62bd, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359128561 duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ddcdea62-71e0-4279-a96c-aeea90580249, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359733153 elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: c953d2ac-57ba-47a4-ad49-24161a18770b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360356635 redridge.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360119140 eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30328724294 hinterlands.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra
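
One way to read the trace above is to turn each clienttrace line into a per-hop throughput figure. The sketch below is only an illustration, not something from the cluster tooling: the name hop_throughput and the regular expression are made up for this example, and it assumes the duration field is in nanoseconds, which is consistent with the roughly 30 seconds between the Receiving and clienttrace timestamps. The two sample lines are the last and the first hop of the pipeline, copied from the trace.

```python
import re

# Pulls src, dest, bytes and duration out of a DataNode clienttrace line.
CLIENTTRACE = re.compile(
    r"src: /(?P<src>[\d.]+):\d+, dest: /(?P<dest>[\d.]+):\d+, "
    r"bytes: (?P<bytes>\d+), op: HDFS_WRITE, .*duration: (?P<duration>\d+)"
)


def hop_throughput(lines):
    """Return (src, dest, seconds, MB/s) for every HDFS_WRITE clienttrace line."""
    hops = []
    for line in lines:
        m = CLIENTTRACE.search(line)
        if m is None:
            continue  # Receiving and PacketResponder lines carry no duration
        seconds = int(m["duration"]) / 1e9  # assumption: duration is in nanoseconds
        mb_per_s = int(m["bytes"]) / 1e6 / seconds  # MB = 10^6 bytes
        hops.append((m["src"], m["dest"], seconds, mb_per_s))
    return hops


# Last hop (gilneas -> scarlet) and first hop (client -> hinterlands) from the trace.
sample = [
    "2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357265647 scarlet.trading.imc.intra",
    "2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30328724294 hinterlands.trading.imc.intra",
]

for src, dest, seconds, mb_per_s in hop_throughput(sample):
    print(f"{src} -> {dest}: {seconds:.2f} s, {mb_per_s:.1f} MB/s")
```

On these two lines the result is roughly 17.7 MB/s for both the first and the last hop. Since a pipelined write moves at the pace of its slowest link, near-identical figures on every hop are expected, so this trace alone does not single out one slow data node; running the same calculation on Cluster A at several replication factors may be more telling.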




From: Stanley Shi <ss...@pivotal.io>
Reply-To: user@hadoop.apache.org
Date: Tuesday, September 2, 2014 at 4:30 AM
To: user@hadoop.apache.org
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Re: Replication factor affecting write performance

What's the network setup and topology?
Also, the size of the cluster?


On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <La...@imc.nl> wrote:
And now with the right label on the Y-axis.

[Inline image: chart with the corrected Y-axis label; not preserved in this plain-text archive]

From: Microsoft Office User <la...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: user@hadoop.apache.org
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication factor.
Another cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?






--
Regards,
Stanley Shi,

Re: Replication factor affecting write performance

Posted by Laurens Bronwasser <La...@imc.nl>.
Thanks Stanley for responding.

The size of the cluster B is 12 data nodes. Cluster A has 18 (slower, older) data nodes.
The topology is that all data nodes and name nodes are under one switch. The clusters are not sharing a switch though, the cluster are in different data centers.

Perhaps this helps, below is a trace of the combined logs of all data nodes, after writing one block of 500MB with replication factor 12, on Cluster B.
Note that the logs are combined and sorted on millisecond precision, certain lines may have happened in a different order:

2014-09-02 15:18:41,135 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.87:55341 dest: /10.141.0.159:50010 hinterlands.trading.imc.intra
2014-09-02 15:18:41,137 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.159:46759 dest: /10.141.0.168:50010 redridge.trading.imc.intra
2014-09-02 15:18:41,138 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.168:37742 dest: /10.141.0.157:50010 eversong.trading.imc.intra
2014-09-02 15:18:41,139 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.157:36874 dest: /10.141.0.156:50010 elwynn.trading.imc.intra
2014-09-02 15:18:41,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.156:43356 dest: /10.141.0.170:50010 searing.trading.imc.intra
2014-09-02 15:18:41,141 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.170:35201 dest: /10.141.0.155:50010 duskwood.trading.imc.intra
2014-09-02 15:18:41,142 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.155:52119 dest: /10.141.0.174:50010 tirisfal.trading.imc.intra
2014-09-02 15:18:41,143 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.174:45775 dest: /10.141.0.173:50010 sorrows.trading.imc.intra
2014-09-02 15:18:41,144 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.173:55624 dest: /10.141.0.171:50010 shimmering.trading.imc.intra
2014-09-02 15:18:41,145 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.171:45532 dest: /10.141.0.172:50010 silverpine.trading.imc.intra
2014-09-02 15:18:41,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.172:35460 dest: /10.141.0.158:50010 gilneas.trading.imc.intra
2014-09-02 15:18:41,147 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: /10.141.0.158:59490 dest: /10.141.0.169:50010 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357265647 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.172:35460, dest: /10.141.0.158:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 66b3bcb2-36a8-4684-878f-2b90cec9d558, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357606668 gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=LAST_IN_PIPELINE, downstreams=0:[] terminating scarlet.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.171:45532, dest: /10.141.0.172:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3d587891-c434-4a45-acb1-e41122ab71f9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30357881405 silverpine.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.173:55624, dest: /10.141.0.171:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 3b0fcf75-78d9-4bcd-b5e0-941a8dd40da8, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358283146 shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358543752 sorrows.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30358704352 tirisfal.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359566398 searing.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 08b14350-3484-4a0e-8520-8167733c62bd, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359128561 duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ddcdea62-71e0-4279-a96c-aeea90580249, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30359733153 elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: c953d2ac-57ba-47a4-ad49-24161a18770b, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360356635 redridge.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30360119140 eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 30328724294 hinterlands.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra




From: Stanley Shi <ss...@pivotal.io>>
Reply-To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Date: Tuesday, September 2, 2014 at 4:30 AM
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Cc: Julien Lehuen <ju...@imc.nl>>, Tyler McDougall <Ty...@imc.nl>>
Subject: Re: Replication factor affecting write performance

What's the network setup and topology?
Also, the size of the cluster?


On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <La...@imc.nl>> wrote:
And now with the right label on the Y-axis.

[cid:7D64F387-C6F1-4B37-9894-0166EC949EF9]

From: Microsoft Office User <la...@imc.nl>>
Date: Monday, September 1, 2014 at 9:56 AM
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Cc: Julien Lehuen <ju...@imc.nl>>, Tyler McDougall <Ty...@imc.nl>>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
On cluster shows a very strong degradation when we increase the replication factor.
Another cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?





--
Regards,
Stanley Shi,

RE: Replication factor affecting write performance

Posted by mike Zarrin <mi...@unitedrmr.com>.
Unsubscribe

From: Stanley Shi [mailto:sshi@pivotal.io]
Sent: Monday, September 01, 2014 7:31 PM
To: user@hadoop.apache.org
Cc: Julien Lehuen; Tyler McDougall
Subject: Re: Replication factor affecting write performance

What's the network setup and topology?
Also, the size of the cluster?

On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <La...@imc.nl> wrote:

And now with the right label on the Y-axis.

From: Microsoft Office User <la...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication factor.
The other cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?

--
Regards,
Stanley Shi,


Re: Replication factor affecting write performance

Posted by Stanley Shi <ss...@pivotal.io>.
What's the network setup and topology?
Also, the size of the cluster?


On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser <
Laurens.Bronwasser@imc.nl> wrote:

>  And now with the right label on the Y-axis.
>
>
>   From: Microsoft Office User <la...@imc.nl>
> Date: Monday, September 1, 2014 at 9:56 AM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <
> Tyler.McDougall@imc.nl>
> Subject: Replication factor affecting write performance
>
>   Hi,
> We have a setup with two clusters.
> One cluster shows a very strong degradation when we increase the
> replication factor.
> The other cluster shows hardly any degradation with increased replication
> factor.
>
>  Any idea how to find out the bottleneck in the slower cluster?
>
>
>



-- 
Regards,
*Stanley Shi,*

Re: Replication factor affecting write performance

Posted by Laurens Bronwasser <La...@imc.nl>.
And now with the right label on the Y-axis.

[cid:7D64F387-C6F1-4B37-9894-0166EC949EF9]

From: Microsoft Office User <la...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Cc: Julien Lehuen <ju...@imc.nl>, Tyler McDougall <Ty...@imc.nl>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication factor.
The other cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?



