You are viewing a plain text version of this content. The canonical link for it is here.
Posted to hdfs-user@hadoop.apache.org by lei liu <li...@gmail.com> on 2013/08/12 12:12:39 UTC
when Standby Namenode is doing checkpoint, the Active NameNode is slow.
When Standby Namenode is doing checkpoint, upload the image file to Active
NameNode, the Active NameNode is very slow. What is reason result to the
Active NameNode is slow?
Thanks,
LiuLei
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
Hi Jitendra,
I don't use the compression parameter.
My network card is 100M/s, and I set the "dfs.image.transfer.bandwidthPerSec."
to 50M, so I think Active NameNode still has 50M bandwidth to be used to
handle RPC request, why the OPS dropped by 50%?
2013/8/15 Jitendra Yadav <je...@gmail.com>
> Hi,
> Looks like you got some pace, did you also tried with compression
> parameter? I think you will get more optimization with it. Also file
> transfer speed depends on our network bandwidth between PNN/SNN and network
> traffic b/w nodes.What's your network conf?
> Thanks
>
>
> On Wed, Aug 14, 2013 at 11:39 AM, lei liu <li...@gmail.com> wrote:
>
>> I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the
>> performance is below:
>>
>> 2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
>> speed:1111
>> 2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
>> speed:1044
>> 2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
>> speed:1109
>> *2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
>> speed:539*
>> *2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
>> speed:495
>> 2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
>> speed:438
>> 2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
>> speed:447*
>> *2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
>> speed:665*
>> 2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
>> speed:1056
>> 2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
>> speed:1048
>> 2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
>> speed:1053
>> 2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
>> speed:1042
>> 2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
>> speed:1051
>> 2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
>> speed:1033
>> 2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
>> speed:1042
>> 2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
>> speed:1137
>> 2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
>> speed:1071
>> 2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
>> speed:1133
>> 2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
>> speed:1108
>> 2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
>> speed:1056
>> 2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
>> speed:1074
>> 2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
>> speed:1125
>> 2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
>> speed:1107
>> 2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
>> speed:1077
>> 2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
>> speed:1029
>> 2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
>> speed:1095
>> 2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
>> speed:1132
>> 2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
>> speed:1108
>> 2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
>> speed:980
>> 2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
>> speed:1097
>> 2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
>> speed:1086
>> 2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
>> speed:1099
>> 2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
>> speed:1024
>> 2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
>> speed:960
>> 2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
>> speed:1013
>> 2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
>> speed:1097
>> 2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
>> speed:1085
>> 2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
>> speed:1049
>> 2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
>> speed:839
>> *2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
>> speed:464
>> 2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
>> speed:483
>> 2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
>> speed:447*
>> *2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
>> speed:504*
>> 2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
>> speed:809
>>
>>
>>
>> The there are below info in Active NameNode:
>> 2013-08-14 12:44:47,301 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
>> to
>> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
>> 2013-08-14 12:48:57,529 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
>> 250.23s at 10280.59 KB/s*
>> 2013-08-14 12:48:57,530 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
>> fsimage.ckpt_0000000000655178418 size 2634222089 bytes
>>
>>
>> The there are below info in StadnbyNameNode:
>> 2013-08-14 12:43:57,924 INFO
>> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
>> checkpoint because there have been 2421083 txns since the last checkpoint,
>> which exceeds the configured threshold 1000000
>> 2013-08-14 12:43:57,925 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
>> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
>> using no compression
>> 2013-08-14 12:48:58,044 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
>> 250.75s at 0.00 KB/s
>> 2013-08-14 12:48:58,045 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
>> txid 655178418 to namenode at 10.232.98.77:20021
>>
>>
>>
>> *When Active NameNode download the fsimage, the OPS is less than 500, the
>> OPS dropped by 50%.*
>>
>> The "checkpoint.png" attachment is monitoring info in ganglia.
>>
>>
>> I have two questions:
>>
>> - *first question:* The Active NameNode's bandwidth is occupied by
>> 50%, Active NameNode still has 50M bandwidth to be used to handle
>> RPC request, why the OPS dropped by 50%?
>>
>>
>> - *second question:* The fsimage file is 2634222089 bytes, Active
>> NameNode spend 240s to download the file. Yhe
>> "dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
>> time should is about 50s.
>>
>>
>> Thansk,
>>
>> LiuLei
>>
>>
>>
>>
>>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
Hi Jitendra,
I don't use the compression parameter.
My network card is 100M/s, and I set the "dfs.image.transfer.bandwidthPerSec."
to 50M, so I think Active NameNode still has 50M bandwidth to be used to
handle RPC request, why the OPS dropped by 50%?
2013/8/15 Jitendra Yadav <je...@gmail.com>
> Hi,
> Looks like you got some pace, did you also tried with compression
> parameter? I think you will get more optimization with it. Also file
> transfer speed depends on our network bandwidth between PNN/SNN and network
> traffic b/w nodes.What's your network conf?
> Thanks
>
>
> On Wed, Aug 14, 2013 at 11:39 AM, lei liu <li...@gmail.com> wrote:
>
>> I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the
>> performance is below:
>>
>> 2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
>> speed:1111
>> 2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
>> speed:1044
>> 2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
>> speed:1109
>> *2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
>> speed:539*
>> *2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
>> speed:495
>> 2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
>> speed:438
>> 2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
>> speed:447*
>> *2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
>> speed:665*
>> 2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
>> speed:1056
>> 2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
>> speed:1048
>> 2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
>> speed:1053
>> 2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
>> speed:1042
>> 2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
>> speed:1051
>> 2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
>> speed:1033
>> 2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
>> speed:1042
>> 2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
>> speed:1137
>> 2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
>> speed:1071
>> 2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
>> speed:1133
>> 2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
>> speed:1108
>> 2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
>> speed:1056
>> 2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
>> speed:1074
>> 2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
>> speed:1125
>> 2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
>> speed:1107
>> 2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
>> speed:1077
>> 2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
>> speed:1029
>> 2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
>> speed:1095
>> 2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
>> speed:1132
>> 2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
>> speed:1108
>> 2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
>> speed:980
>> 2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
>> speed:1097
>> 2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
>> speed:1086
>> 2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
>> speed:1099
>> 2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
>> speed:1024
>> 2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
>> speed:960
>> 2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
>> speed:1013
>> 2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
>> speed:1097
>> 2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
>> speed:1085
>> 2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
>> speed:1049
>> 2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
>> speed:839
>> *2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
>> speed:464
>> 2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
>> speed:483
>> 2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
>> speed:447*
>> *2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
>> speed:504*
>> 2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
>> speed:809
>>
>>
>>
>> The there are below info in Active NameNode:
>> 2013-08-14 12:44:47,301 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
>> to
>> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
>> 2013-08-14 12:48:57,529 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
>> 250.23s at 10280.59 KB/s*
>> 2013-08-14 12:48:57,530 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
>> fsimage.ckpt_0000000000655178418 size 2634222089 bytes
>>
>>
>> The there are below info in StadnbyNameNode:
>> 2013-08-14 12:43:57,924 INFO
>> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
>> checkpoint because there have been 2421083 txns since the last checkpoint,
>> which exceeds the configured threshold 1000000
>> 2013-08-14 12:43:57,925 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
>> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
>> using no compression
>> 2013-08-14 12:48:58,044 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
>> 250.75s at 0.00 KB/s
>> 2013-08-14 12:48:58,045 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
>> txid 655178418 to namenode at 10.232.98.77:20021
>>
>>
>>
>> *When Active NameNode download the fsimage, the OPS is less than 500, the
>> OPS dropped by 50%.*
>>
>> The "checkpoint.png" attachment is monitoring info in ganglia.
>>
>>
>> I have two questions:
>>
>> - *first question:* The Active NameNode's bandwidth is occupied by
>> 50%, Active NameNode still has 50M bandwidth to be used to handle
>> RPC request, why the OPS dropped by 50%?
>>
>>
>> - *second question:* The fsimage file is 2634222089 bytes, Active
>> NameNode spend 240s to download the file. Yhe
>> "dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
>> time should is about 50s.
>>
>>
>> Thansk,
>>
>> LiuLei
>>
>>
>>
>>
>>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
Hi Jitendra,
I don't use the compression parameter.
My network card is 100M/s, and I set the "dfs.image.transfer.bandwidthPerSec."
to 50M, so I think Active NameNode still has 50M bandwidth to be used to
handle RPC request, why the OPS dropped by 50%?
2013/8/15 Jitendra Yadav <je...@gmail.com>
> Hi,
> Looks like you got some pace, did you also tried with compression
> parameter? I think you will get more optimization with it. Also file
> transfer speed depends on our network bandwidth between PNN/SNN and network
> traffic b/w nodes.What's your network conf?
> Thanks
>
>
> On Wed, Aug 14, 2013 at 11:39 AM, lei liu <li...@gmail.com> wrote:
>
>> I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the
>> performance is below:
>>
>> 2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
>> speed:1111
>> 2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
>> speed:1044
>> 2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
>> speed:1109
>> *2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
>> speed:539*
>> *2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
>> speed:495
>> 2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
>> speed:438
>> 2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
>> speed:447*
>> *2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
>> speed:665*
>> 2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
>> speed:1056
>> 2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
>> speed:1048
>> 2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
>> speed:1053
>> 2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
>> speed:1042
>> 2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
>> speed:1051
>> 2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
>> speed:1033
>> 2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
>> speed:1042
>> 2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
>> speed:1137
>> 2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
>> speed:1071
>> 2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
>> speed:1133
>> 2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
>> speed:1108
>> 2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
>> speed:1056
>> 2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
>> speed:1074
>> 2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
>> speed:1125
>> 2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
>> speed:1107
>> 2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
>> speed:1077
>> 2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
>> speed:1029
>> 2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
>> speed:1095
>> 2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
>> speed:1132
>> 2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
>> speed:1108
>> 2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
>> speed:980
>> 2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
>> speed:1097
>> 2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
>> speed:1086
>> 2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
>> speed:1099
>> 2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
>> speed:1024
>> 2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
>> speed:960
>> 2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
>> speed:1013
>> 2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
>> speed:1097
>> 2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
>> speed:1085
>> 2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
>> speed:1049
>> 2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
>> speed:839
>> *2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
>> speed:464
>> 2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
>> speed:483
>> 2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
>> speed:447*
>> *2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
>> speed:504*
>> 2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
>> speed:809
>>
>>
>>
>> The there are below info in Active NameNode:
>> 2013-08-14 12:44:47,301 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
>> to
>> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
>> 2013-08-14 12:48:57,529 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
>> 250.23s at 10280.59 KB/s*
>> 2013-08-14 12:48:57,530 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
>> fsimage.ckpt_0000000000655178418 size 2634222089 bytes
>>
>>
>> The there are below info in StadnbyNameNode:
>> 2013-08-14 12:43:57,924 INFO
>> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
>> checkpoint because there have been 2421083 txns since the last checkpoint,
>> which exceeds the configured threshold 1000000
>> 2013-08-14 12:43:57,925 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
>> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
>> using no compression
>> 2013-08-14 12:48:58,044 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
>> 250.75s at 0.00 KB/s
>> 2013-08-14 12:48:58,045 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
>> txid 655178418 to namenode at 10.232.98.77:20021
>>
>>
>>
>> *When Active NameNode download the fsimage, the OPS is less than 500, the
>> OPS dropped by 50%.*
>>
>> The "checkpoint.png" attachment is monitoring info in ganglia.
>>
>>
>> I have two questions:
>>
>> - *first question:* The Active NameNode's bandwidth is occupied by
>> 50%, Active NameNode still has 50M bandwidth to be used to handle
>> RPC request, why the OPS dropped by 50%?
>>
>>
>> - *second question:* The fsimage file is 2634222089 bytes, Active
>> NameNode spend 240s to download the file. Yhe
>> "dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
>> time should is about 50s.
>>
>>
>> Thansk,
>>
>> LiuLei
>>
>>
>>
>>
>>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
Hi Jitendra,
I don't use the compression parameter.
My network card is 100M/s, and I set the "dfs.image.transfer.bandwidthPerSec."
to 50M, so I think Active NameNode still has 50M bandwidth to be used to
handle RPC request, why the OPS dropped by 50%?
2013/8/15 Jitendra Yadav <je...@gmail.com>
> Hi,
> Looks like you got some pace, did you also tried with compression
> parameter? I think you will get more optimization with it. Also file
> transfer speed depends on our network bandwidth between PNN/SNN and network
> traffic b/w nodes.What's your network conf?
> Thanks
>
>
> On Wed, Aug 14, 2013 at 11:39 AM, lei liu <li...@gmail.com> wrote:
>
>> I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the
>> performance is below:
>>
>> 2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
>> speed:1111
>> 2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
>> speed:1044
>> 2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
>> speed:1109
>> *2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
>> speed:539*
>> *2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
>> speed:495
>> 2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
>> speed:438
>> 2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
>> speed:447*
>> *2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
>> speed:665*
>> 2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
>> speed:1056
>> 2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
>> speed:1048
>> 2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
>> speed:1053
>> 2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
>> speed:1042
>> 2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
>> speed:1051
>> 2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
>> speed:1033
>> 2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
>> speed:1042
>> 2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
>> speed:1137
>> 2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
>> speed:1071
>> 2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
>> speed:1133
>> 2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
>> speed:1108
>> 2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
>> speed:1056
>> 2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
>> speed:1074
>> 2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
>> speed:1125
>> 2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
>> speed:1107
>> 2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
>> speed:1077
>> 2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
>> speed:1029
>> 2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
>> speed:1095
>> 2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
>> speed:1132
>> 2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
>> speed:1108
>> 2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
>> speed:980
>> 2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
>> speed:1097
>> 2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
>> speed:1086
>> 2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
>> speed:1099
>> 2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
>> speed:1024
>> 2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
>> speed:960
>> 2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
>> speed:1013
>> 2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
>> speed:1097
>> 2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
>> speed:1085
>> 2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
>> speed:1049
>> 2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
>> speed:839
>> *2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
>> speed:464
>> 2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
>> speed:483
>> 2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
>> speed:447*
>> *2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
>> speed:504*
>> 2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
>> speed:809
>>
>>
>>
>> The there are below info in Active NameNode:
>> 2013-08-14 12:44:47,301 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
>> to
>> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
>> 2013-08-14 12:48:57,529 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
>> 250.23s at 10280.59 KB/s*
>> 2013-08-14 12:48:57,530 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
>> fsimage.ckpt_0000000000655178418 size 2634222089 bytes
>>
>>
>> The there are below info in StadnbyNameNode:
>> 2013-08-14 12:43:57,924 INFO
>> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
>> checkpoint because there have been 2421083 txns since the last checkpoint,
>> which exceeds the configured threshold 1000000
>> 2013-08-14 12:43:57,925 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
>> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
>> using no compression
>> 2013-08-14 12:48:58,044 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
>> 250.75s at 0.00 KB/s
>> 2013-08-14 12:48:58,045 INFO
>> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
>> txid 655178418 to namenode at 10.232.98.77:20021
>>
>>
>>
>> *When Active NameNode download the fsimage, the OPS is less than 500, the
>> OPS dropped by 50%.*
>>
>> The "checkpoint.png" attachment is monitoring info in ganglia.
>>
>>
>> I have two questions:
>>
>> - *first question:* The Active NameNode's bandwidth is occupied by
>> 50%, Active NameNode still has 50M bandwidth to be used to handle
>> RPC request, why the OPS dropped by 50%?
>>
>>
>> - *second question:* The fsimage file is 2634222089 bytes, Active
>> NameNode spend 240s to download the file. Yhe
>> "dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
>> time should is about 50s.
>>
>>
>> Thansk,
>>
>> LiuLei
>>
>>
>>
>>
>>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
Looks like you got some pace, did you also tried with compression
parameter? I think you will get more optimization with it. Also file
transfer speed depends on our network bandwidth between PNN/SNN and network
traffic b/w nodes.What's your network conf?
Thanks
On Wed, Aug 14, 2013 at 11:39 AM, lei liu <li...@gmail.com> wrote:
> I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the
> performance is below:
>
> 2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
> speed:1111
> 2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
> speed:1044
> 2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
> speed:1109
> *2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
> speed:539*
> *2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
> speed:495
> 2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
> speed:438
> 2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
> speed:447*
> *2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
> speed:665*
> 2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
> speed:1056
> 2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
> speed:1048
> 2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
> speed:1053
> 2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
> speed:1042
> 2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
> speed:1051
> 2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
> speed:1033
> 2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
> speed:1042
> 2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
> speed:1137
> 2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
> speed:1071
> 2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
> speed:1133
> 2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
> speed:1108
> 2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
> speed:1056
> 2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
> speed:1074
> 2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
> speed:1125
> 2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
> speed:1107
> 2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
> speed:1077
> 2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
> speed:1029
> 2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
> speed:1095
> 2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
> speed:1132
> 2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
> speed:1108
> 2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
> speed:980
> 2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
> speed:1097
> 2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
> speed:1086
> 2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
> speed:1099
> 2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
> speed:1024
> 2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
> speed:960
> 2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
> speed:1013
> 2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
> speed:1097
> 2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
> speed:1085
> 2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
> speed:1049
> 2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
> speed:839
> *2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
> speed:464
> 2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
> speed:483
> 2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
> speed:447*
> *2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
> speed:504*
> 2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
> speed:809
>
>
>
> The there are below info in Active NameNode:
> 2013-08-14 12:44:47,301 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-14 12:48:57,529 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
> 250.23s at 10280.59 KB/s*
> 2013-08-14 12:48:57,530 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> fsimage.ckpt_0000000000655178418 size 2634222089 bytes
>
>
> The there are below info in StadnbyNameNode:
> 2013-08-14 12:43:57,924 INFO
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> checkpoint because there have been 2421083 txns since the last checkpoint,
> which exceeds the configured threshold 1000000
> 2013-08-14 12:43:57,925 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
> using no compression
> 2013-08-14 12:48:58,044 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 250.75s at 0.00 KB/s
> 2013-08-14 12:48:58,045 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
> txid 655178418 to namenode at 10.232.98.77:20021
>
>
>
> *When Active NameNode download the fsimage, the OPS is less than 500, the
> OPS dropped by 50%.*
>
> The "checkpoint.png" attachment is monitoring info in ganglia.
>
>
> I have two questions:
>
> - *first question:* The Active NameNode's bandwidth is occupied by
> 50%, Active NameNode still has 50M bandwidth to be used to handle
> RPC request, why the OPS dropped by 50%?
>
>
> - *second question:* The fsimage file is 2634222089 bytes, Active
> NameNode spend 240s to download the file. Yhe
> "dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
> time should is about 50s.
>
>
> Thansk,
>
> LiuLei
>
>
>
>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
Looks like you got some pace, did you also tried with compression
parameter? I think you will get more optimization with it. Also file
transfer speed depends on our network bandwidth between PNN/SNN and network
traffic b/w nodes.What's your network conf?
Thanks
On Wed, Aug 14, 2013 at 11:39 AM, lei liu <li...@gmail.com> wrote:
> I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the
> performance is below:
>
> 2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
> speed:1111
> 2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
> speed:1044
> 2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
> speed:1109
> *2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
> speed:539*
> *2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
> speed:495
> 2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
> speed:438
> 2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
> speed:447*
> *2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
> speed:665*
> 2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
> speed:1056
> 2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
> speed:1048
> 2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
> speed:1053
> 2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
> speed:1042
> 2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
> speed:1051
> 2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
> speed:1033
> 2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
> speed:1042
> 2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
> speed:1137
> 2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
> speed:1071
> 2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
> speed:1133
> 2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
> speed:1108
> 2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
> speed:1056
> 2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
> speed:1074
> 2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
> speed:1125
> 2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
> speed:1107
> 2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
> speed:1077
> 2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
> speed:1029
> 2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
> speed:1095
> 2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
> speed:1132
> 2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
> speed:1108
> 2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
> speed:980
> 2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
> speed:1097
> 2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
> speed:1086
> 2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
> speed:1099
> 2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
> speed:1024
> 2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
> speed:960
> 2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
> speed:1013
> 2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
> speed:1097
> 2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
> speed:1085
> 2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
> speed:1049
> 2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
> speed:839
> *2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
> speed:464
> 2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
> speed:483
> 2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
> speed:447*
> *2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
> speed:504*
> 2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
> speed:809
>
>
>
> The there are below info in Active NameNode:
> 2013-08-14 12:44:47,301 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-14 12:48:57,529 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
> 250.23s at 10280.59 KB/s*
> 2013-08-14 12:48:57,530 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> fsimage.ckpt_0000000000655178418 size 2634222089 bytes
>
>
> The there are below info in StadnbyNameNode:
> 2013-08-14 12:43:57,924 INFO
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> checkpoint because there have been 2421083 txns since the last checkpoint,
> which exceeds the configured threshold 1000000
> 2013-08-14 12:43:57,925 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
> using no compression
> 2013-08-14 12:48:58,044 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 250.75s at 0.00 KB/s
> 2013-08-14 12:48:58,045 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
> txid 655178418 to namenode at 10.232.98.77:20021
>
>
>
> *When Active NameNode download the fsimage, the OPS is less than 500, the
> OPS dropped by 50%.*
>
> The "checkpoint.png" attachment is monitoring info in ganglia.
>
>
> I have two questions:
>
> - *first question:* The Active NameNode's bandwidth is occupied by
> 50%, Active NameNode still has 50M bandwidth to be used to handle
> RPC request, why the OPS dropped by 50%?
>
>
> - *second question:* The fsimage file is 2634222089 bytes, Active
> NameNode spend 240s to download the file. Yhe
> "dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
> time should is about 50s.
>
>
> Thansk,
>
> LiuLei
>
>
>
>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
Looks like you got some pace, did you also tried with compression
parameter? I think you will get more optimization with it. Also file
transfer speed depends on our network bandwidth between PNN/SNN and network
traffic b/w nodes.What's your network conf?
Thanks
On Wed, Aug 14, 2013 at 11:39 AM, lei liu <li...@gmail.com> wrote:
> I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the
> performance is below:
>
> 2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
> speed:1111
> 2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
> speed:1044
> 2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
> speed:1109
> *2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
> speed:539*
> *2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
> speed:495
> 2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
> speed:438
> 2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
> speed:447*
> *2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
> speed:665*
> 2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
> speed:1056
> 2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
> speed:1048
> 2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
> speed:1053
> 2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
> speed:1042
> 2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
> speed:1051
> 2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
> speed:1033
> 2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
> speed:1042
> 2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
> speed:1137
> 2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
> speed:1071
> 2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
> speed:1133
> 2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
> speed:1108
> 2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
> speed:1056
> 2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
> speed:1074
> 2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
> speed:1125
> 2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
> speed:1107
> 2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
> speed:1077
> 2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
> speed:1029
> 2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
> speed:1095
> 2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
> speed:1132
> 2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
> speed:1108
> 2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
> speed:980
> 2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
> speed:1097
> 2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
> speed:1086
> 2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
> speed:1099
> 2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
> speed:1024
> 2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
> speed:960
> 2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
> speed:1013
> 2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
> speed:1097
> 2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
> speed:1085
> 2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
> speed:1049
> 2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
> speed:839
> *2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
> speed:464
> 2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
> speed:483
> 2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
> speed:447*
> *2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
> speed:504*
> 2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
> speed:809
>
>
>
> The there are below info in Active NameNode:
> 2013-08-14 12:44:47,301 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-14 12:48:57,529 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
> 250.23s at 10280.59 KB/s*
> 2013-08-14 12:48:57,530 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> fsimage.ckpt_0000000000655178418 size 2634222089 bytes
>
>
> The there are below info in StadnbyNameNode:
> 2013-08-14 12:43:57,924 INFO
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> checkpoint because there have been 2421083 txns since the last checkpoint,
> which exceeds the configured threshold 1000000
> 2013-08-14 12:43:57,925 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
> using no compression
> 2013-08-14 12:48:58,044 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 250.75s at 0.00 KB/s
> 2013-08-14 12:48:58,045 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
> txid 655178418 to namenode at 10.232.98.77:20021
>
>
>
> *When Active NameNode download the fsimage, the OPS is less than 500, the
> OPS dropped by 50%.*
>
> The "checkpoint.png" attachment is monitoring info in ganglia.
>
>
> I have two questions:
>
> - *first question:* The Active NameNode's bandwidth is occupied by
> 50%, Active NameNode still has 50M bandwidth to be used to handle
> RPC request, why the OPS dropped by 50%?
>
>
> - *second question:* The fsimage file is 2634222089 bytes, Active
> NameNode spend 240s to download the file. Yhe
> "dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
> time should is about 50s.
>
>
> Thansk,
>
> LiuLei
>
>
>
>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
Looks like you got some pace, did you also tried with compression
parameter? I think you will get more optimization with it. Also file
transfer speed depends on our network bandwidth between PNN/SNN and network
traffic b/w nodes.What's your network conf?
Thanks
On Wed, Aug 14, 2013 at 11:39 AM, lei liu <li...@gmail.com> wrote:
> I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the
> performance is below:
>
> 2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
> speed:1111
> 2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
> speed:1044
> 2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
> speed:1109
> *2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
> speed:539*
> *2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
> speed:495
> 2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
> speed:438
> 2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
> speed:447*
> *2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
> speed:665*
> 2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
> speed:1056
> 2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
> speed:1048
> 2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
> speed:1053
> 2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
> speed:1042
> 2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
> speed:1051
> 2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
> speed:1033
> 2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
> speed:1042
> 2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
> speed:1137
> 2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
> speed:1071
> 2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
> speed:1133
> 2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
> speed:1108
> 2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
> speed:1056
> 2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
> speed:1074
> 2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
> speed:1125
> 2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
> speed:1107
> 2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
> speed:1077
> 2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
> speed:1029
> 2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
> speed:1095
> 2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
> speed:1132
> 2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
> speed:1108
> 2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
> speed:980
> 2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
> speed:1097
> 2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
> speed:1086
> 2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
> speed:1099
> 2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
> speed:1024
> 2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
> speed:960
> 2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
> speed:1013
> 2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
> speed:1097
> 2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
> speed:1085
> 2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
> speed:1049
> 2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
> speed:839
> *2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
> speed:464
> 2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
> speed:483
> 2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
> speed:447*
> *2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
> speed:504*
> 2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
> speed:809
>
>
>
> The there are below info in Active NameNode:
> 2013-08-14 12:44:47,301 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-14 12:48:57,529 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
> 250.23s at 10280.59 KB/s*
> 2013-08-14 12:48:57,530 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> fsimage.ckpt_0000000000655178418 size 2634222089 bytes
>
>
> The there are below info in StadnbyNameNode:
> 2013-08-14 12:43:57,924 INFO
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> checkpoint because there have been 2421083 txns since the last checkpoint,
> which exceeds the configured threshold 1000000
> 2013-08-14 12:43:57,925 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
> using no compression
> 2013-08-14 12:48:58,044 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 250.75s at 0.00 KB/s
> 2013-08-14 12:48:58,045 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
> txid 655178418 to namenode at 10.232.98.77:20021
>
>
>
> *When Active NameNode download the fsimage, the OPS is less than 500, the
> OPS dropped by 50%.*
>
> The "checkpoint.png" attachment is monitoring info in ganglia.
>
>
> I have two questions:
>
> - *first question:* The Active NameNode's bandwidth is occupied by
> 50%, Active NameNode still has 50M bandwidth to be used to handle
> RPC request, why the OPS dropped by 50%?
>
>
> - *second question:* The fsimage file is 2634222089 bytes, Active
> NameNode spend 240s to download the file. Yhe
> "dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
> time should is about 50s.
>
>
> Thansk,
>
> LiuLei
>
>
>
>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the performance
is below:
2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
speed:1111
2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
speed:1044
2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
speed:1109
*2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
speed:539*
*2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
speed:495
2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
speed:438
2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
speed:447*
*2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
speed:665*
2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
speed:1056
2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
speed:1048
2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
speed:1053
2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
speed:1042
2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
speed:1051
2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
speed:1033
2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
speed:1042
2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
speed:1137
2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
speed:1071
2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
speed:1133
2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
speed:1108
2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
speed:1056
2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
speed:1074
2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
speed:1125
2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
speed:1107
2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
speed:1077
2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
speed:1029
2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
speed:1095
2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
speed:1132
2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
speed:1108
2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
speed:980
2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
speed:1097
2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
speed:1086
2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
speed:1099
2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
speed:1024
2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
speed:960
2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
speed:1013
2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
speed:1097
2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
speed:1085
2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
speed:1049
2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
speed:839
*2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
speed:464
2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
speed:483
2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
speed:447*
*2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
speed:504*
2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
speed:809
The there are below info in Active NameNode:
2013-08-14 12:44:47,301 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-14 12:48:57,529 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
250.23s at 10280.59 KB/s*
2013-08-14 12:48:57,530 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
fsimage.ckpt_0000000000655178418 size 2634222089 bytes
The there are below info in StadnbyNameNode:
2013-08-14 12:43:57,924 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 2421083 txns since the last checkpoint,
which exceeds the configured threshold 1000000
2013-08-14 12:43:57,925 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
using no compression
2013-08-14 12:48:58,044 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
250.75s at 0.00 KB/s
2013-08-14 12:48:58,045 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 655178418 to namenode at 10.232.98.77:20021
*When Active NameNode download the fsimage, the OPS is less than 500, the
OPS dropped by 50%.*
The "checkpoint.png" attachment is monitoring info in ganglia.
I have two questions:
- *first question:* The Active NameNode's bandwidth is occupied by
50%, Active NameNode still has 50M bandwidth to be used to handle RPC
request, why the OPS dropped by 50%?
- *second question:* The fsimage file is 2634222089 bytes, Active
NameNode spend 240s to download the file. Yhe
"dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
time should is about 50s.
Thansk,
LiuLei
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the performance
is below:
2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
speed:1111
2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
speed:1044
2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
speed:1109
*2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
speed:539*
*2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
speed:495
2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
speed:438
2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
speed:447*
*2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
speed:665*
2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
speed:1056
2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
speed:1048
2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
speed:1053
2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
speed:1042
2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
speed:1051
2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
speed:1033
2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
speed:1042
2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
speed:1137
2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
speed:1071
2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
speed:1133
2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
speed:1108
2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
speed:1056
2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
speed:1074
2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
speed:1125
2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
speed:1107
2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
speed:1077
2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
speed:1029
2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
speed:1095
2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
speed:1132
2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
speed:1108
2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
speed:980
2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
speed:1097
2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
speed:1086
2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
speed:1099
2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
speed:1024
2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
speed:960
2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
speed:1013
2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
speed:1097
2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
speed:1085
2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
speed:1049
2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
speed:839
*2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
speed:464
2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
speed:483
2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
speed:447*
*2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
speed:504*
2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
speed:809
The there are below info in Active NameNode:
2013-08-14 12:44:47,301 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-14 12:48:57,529 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
250.23s at 10280.59 KB/s*
2013-08-14 12:48:57,530 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
fsimage.ckpt_0000000000655178418 size 2634222089 bytes
The there are below info in StadnbyNameNode:
2013-08-14 12:43:57,924 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 2421083 txns since the last checkpoint,
which exceeds the configured threshold 1000000
2013-08-14 12:43:57,925 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
using no compression
2013-08-14 12:48:58,044 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
250.75s at 0.00 KB/s
2013-08-14 12:48:58,045 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 655178418 to namenode at 10.232.98.77:20021
*When Active NameNode download the fsimage, the OPS is less than 500, the
OPS dropped by 50%.*
The "checkpoint.png" attachment is monitoring info in ganglia.
I have two questions:
- *first question:* The Active NameNode's bandwidth is occupied by
50%, Active NameNode still has 50M bandwidth to be used to handle RPC
request, why the OPS dropped by 50%?
- *second question:* The fsimage file is 2634222089 bytes, Active
NameNode spend 240s to download the file. Yhe
"dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
time should is about 50s.
Thansk,
LiuLei
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the performance
is below:
2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
speed:1111
2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
speed:1044
2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
speed:1109
*2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
speed:539*
*2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
speed:495
2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
speed:438
2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
speed:447*
*2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
speed:665*
2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
speed:1056
2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
speed:1048
2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
speed:1053
2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
speed:1042
2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
speed:1051
2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
speed:1033
2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
speed:1042
2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
speed:1137
2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
speed:1071
2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
speed:1133
2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
speed:1108
2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
speed:1056
2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
speed:1074
2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
speed:1125
2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
speed:1107
2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
speed:1077
2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
speed:1029
2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
speed:1095
2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
speed:1132
2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
speed:1108
2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
speed:980
2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
speed:1097
2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
speed:1086
2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
speed:1099
2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
speed:1024
2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
speed:960
2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
speed:1013
2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
speed:1097
2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
speed:1085
2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
speed:1049
2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
speed:839
*2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
speed:464
2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
speed:483
2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
speed:447*
*2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
speed:504*
2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
speed:809
The there are below info in Active NameNode:
2013-08-14 12:44:47,301 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-14 12:48:57,529 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
250.23s at 10280.59 KB/s*
2013-08-14 12:48:57,530 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
fsimage.ckpt_0000000000655178418 size 2634222089 bytes
The there are below info in StadnbyNameNode:
2013-08-14 12:43:57,924 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 2421083 txns since the last checkpoint,
which exceeds the configured threshold 1000000
2013-08-14 12:43:57,925 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
using no compression
2013-08-14 12:48:58,044 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
250.75s at 0.00 KB/s
2013-08-14 12:48:58,045 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 655178418 to namenode at 10.232.98.77:20021
*When Active NameNode download the fsimage, the OPS is less than 500, the
OPS dropped by 50%.*
The "checkpoint.png" attachment is monitoring info in ganglia.
I have two questions:
- *first question:* The Active NameNode's bandwidth is occupied by
50%, Active NameNode still has 50M bandwidth to be used to handle RPC
request, why the OPS dropped by 50%?
- *second question:* The fsimage file is 2634222089 bytes, Active
NameNode spend 240s to download the file. Yhe
"dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
time should is about 50s.
Thansk,
LiuLei
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
I set the "dfs.image.transfer.bandwidthPerSec." to 50M, and the performance
is below:
2013-08-14 12:32:33,079 INFO my.EditLogPerformance: totalCount:1342440
speed:1111
2013-08-14 12:32:43,082 INFO my.EditLogPerformance: totalCount:1363338
speed:1044
2013-08-14 12:32:53,085 INFO my.EditLogPerformance: totalCount:1385526
speed:1109
*2013-08-14 12:33:03,087 INFO my.EditLogPerformance: totalCount:1396324
speed:539*
*2013-08-14 12:33:13,090 INFO my.EditLogPerformance: totalCount:1406232
speed:495
2013-08-14 12:33:23,093 INFO my.EditLogPerformance: totalCount:1415006
speed:438
2013-08-14 12:33:33,096 INFO my.EditLogPerformance: totalCount:1423952
speed:447*
*2013-08-14 12:33:43,099 INFO my.EditLogPerformance: totalCount:1437256
speed:665*
2013-08-14 12:33:53,102 INFO my.EditLogPerformance: totalCount:1458378
speed:1056
2013-08-14 12:34:03,106 INFO my.EditLogPerformance: totalCount:1479338
speed:1048
2013-08-14 12:34:13,108 INFO my.EditLogPerformance: totalCount:1500400
speed:1053
2013-08-14 12:34:23,111 INFO my.EditLogPerformance: totalCount:1521252
speed:1042
2013-08-14 12:34:33,114 INFO my.EditLogPerformance: totalCount:1542286
speed:1051
2013-08-14 12:34:43,117 INFO my.EditLogPerformance: totalCount:1562956
speed:1033
2013-08-14 12:34:53,120 INFO my.EditLogPerformance: totalCount:1583804
speed:1042
2013-08-14 12:35:03,123 INFO my.EditLogPerformance: totalCount:1606558
speed:1137
2013-08-14 12:35:13,126 INFO my.EditLogPerformance: totalCount:1627980
speed:1071
2013-08-14 12:35:23,129 INFO my.EditLogPerformance: totalCount:1650642
speed:1133
2013-08-14 12:35:33,132 INFO my.EditLogPerformance: totalCount:1672806
speed:1108
2013-08-14 12:35:43,134 INFO my.EditLogPerformance: totalCount:1693940
speed:1056
2013-08-14 12:35:53,137 INFO my.EditLogPerformance: totalCount:1715430
speed:1074
2013-08-14 12:36:03,140 INFO my.EditLogPerformance: totalCount:1737940
speed:1125
2013-08-14 12:36:13,143 INFO my.EditLogPerformance: totalCount:1760094
speed:1107
2013-08-14 12:36:23,146 INFO my.EditLogPerformance: totalCount:1781646
speed:1077
2013-08-14 12:36:33,149 INFO my.EditLogPerformance: totalCount:1802230
speed:1029
2013-08-14 12:36:43,152 INFO my.EditLogPerformance: totalCount:1824132
speed:1095
2013-08-14 12:36:53,155 INFO my.EditLogPerformance: totalCount:1846778
speed:1132
2013-08-14 12:37:03,158 INFO my.EditLogPerformance: totalCount:1868956
speed:1108
2013-08-14 12:37:13,161 INFO my.EditLogPerformance: totalCount:1888556
speed:980
2013-08-14 12:37:23,164 INFO my.EditLogPerformance: totalCount:1910512
speed:1097
2013-08-14 12:37:33,167 INFO my.EditLogPerformance: totalCount:1932240
speed:1086
2013-08-14 12:37:43,170 INFO my.EditLogPerformance: totalCount:1954226
speed:1099
2013-08-14 12:37:53,173 INFO my.EditLogPerformance: totalCount:1974706
speed:1024
2013-08-14 12:38:03,176 INFO my.EditLogPerformance: totalCount:1993906
speed:960
2013-08-14 12:38:13,179 INFO my.EditLogPerformance: totalCount:2014172
speed:1013
2013-08-14 12:38:23,182 INFO my.EditLogPerformance: totalCount:2036130
speed:1097
2013-08-14 12:38:33,184 INFO my.EditLogPerformance: totalCount:2057848
speed:1085
2013-08-14 12:38:43,187 INFO my.EditLogPerformance: totalCount:2078834
speed:1049
2013-08-14 12:38:53,190 INFO my.EditLogPerformance: totalCount:2095616
speed:839
*2013-08-14 12:39:03,193 INFO my.EditLogPerformance: totalCount:2104896
speed:464
2013-08-14 12:39:13,196 INFO my.EditLogPerformance: totalCount:2114572
speed:483
2013-08-14 12:39:23,199 INFO my.EditLogPerformance: totalCount:2123512
speed:447*
*2013-08-14 12:39:33,202 INFO my.EditLogPerformance: totalCount:2133604
speed:504*
2013-08-14 12:39:43,205 INFO my.EditLogPerformance: totalCount:2149792
speed:809
The there are below info in Active NameNode:
2013-08-14 12:44:47,301 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=655178418&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-14 12:48:57,529 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: *Transfer took
250.23s at 10280.59 KB/s*
2013-08-14 12:48:57,530 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
fsimage.ckpt_0000000000655178418 size 2634222089 bytes
The there are below info in StadnbyNameNode:
2013-08-14 12:43:57,924 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 2421083 txns since the last checkpoint,
which exceeds the configured threshold 1000000
2013-08-14 12:43:57,925 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000655178418
using no compression
2013-08-14 12:48:58,044 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
250.75s at 0.00 KB/s
2013-08-14 12:48:58,045 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 655178418 to namenode at 10.232.98.77:20021
*When Active NameNode download the fsimage, the OPS is less than 500, the
OPS dropped by 50%.*
The "checkpoint.png" attachment is monitoring info in ganglia.
I have two questions:
- *first question:* The Active NameNode's bandwidth is occupied by
50%, Active NameNode still has 50M bandwidth to be used to handle RPC
request, why the OPS dropped by 50%?
- *second question:* The fsimage file is 2634222089 bytes, Active
NameNode spend 240s to download the file. Yhe
"dfs.image.transfer.bandwidthPerSec" value is 50M, I think the download
time should is about 50s.
Thansk,
LiuLei
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
I'm agreed with Harsh comment on image file compression and transfer
bandwidth parameter for optimizing checkpoint process.
In addition I'm not able to correlate your performance program log
timings(less then 10) and file transfer logs timing on active/stand by
nodes.
Thanks
On Tue, Aug 13, 2013 at 4:06 PM, Harsh J <ha...@cloudera.com> wrote:
> Perhaps turning on fsimage compression may help. See
> documentation of dfs.image.compress at
>
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
> .
>
> You can also try to throttle the bandwidth it uses via
> dfs.image.transfer.bandwidthPerSec.
>
> On Tue, Aug 13, 2013 at 3:45 PM, lei liu <li...@gmail.com> wrote:
> > I write one programm to test NameNode performance. Please see the
> > EditLogPerformance.java
> >
> > I use 60 threads to execute the EditLogPerformance.javacode, the testing
> > result is below content:
> > 2013-08-13 17:43:01,479 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
> > 2013-08-13 17:43:11,482 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
> > 2013-08-13 17:43:21,484 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
> > 2013-08-13 17:43:31,487 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
> > 2013-08-13 17:43:41,490 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
> > 2013-08-13 17:43:51,493 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
> > 2013-08-13 17:44:01,496 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
> > 2013-08-13 17:44:11,499 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
> > 2013-08-13 17:44:21,502 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
> > 2013-08-13 17:44:31,505 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
> > 2013-08-13 17:44:41,508 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
> > 2013-08-13 17:44:51,511 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
> > 2013-08-13 17:45:01,514 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
> > 2013-08-13 17:45:11,517 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
> > 2013-08-13 17:45:21,520 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
> > 2013-08-13 17:45:31,523 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
> > 2013-08-13 17:45:41,526 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
> > 2013-08-13 17:45:51,529 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
> > 2013-08-13 17:46:01,532 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
> > 2013-08-13 17:46:11,534 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
> > 2013-08-13 17:46:21,537 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
> > 2013-08-13 17:46:31,540 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
> > 2013-08-13 17:46:41,543 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
> > 2013-08-13 17:46:51,546 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
> > 2013-08-13 17:47:01,549 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
> > 2013-08-13 17:47:11,552 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
> > 2013-08-13 17:47:21,555 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
> > 2013-08-13 17:47:31,558 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
> > 2013-08-13 17:47:41,561 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
> > 2013-08-13 17:47:51,564 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
> > 2013-08-13 17:48:01,567 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
> > 2013-08-13 17:48:11,570 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
> > 2013-08-13 17:48:21,573 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
> > 2013-08-13 17:48:31,576 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
> > 2013-08-13 17:48:41,579 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
> > 2013-08-13 17:48:51,581 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
> > 2013-08-13 17:49:01,584 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
> > 2013-08-13 17:49:11,587 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
> > 2013-08-13 17:49:21,590 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
> > 2013-08-13 17:49:31,593 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
> > 2013-08-13 17:49:41,596 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
> > 2013-08-13 17:49:51,599 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
> > 2013-08-13 17:50:01,602 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
> >
> >
> >
> > The speed is less than ten sometimes. I find when Active NameNode
> download
> > the fsimage file, the speed is less than ten. So I think download
> fsimage
> > file that affects the performance of Active NameNode.
> >
> >
> > There are below info in Standby NameNode:
> > 2013-08-13 17:48:12,412 INFO
> > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> > checkpoint because there have been 2558038 txns since the last
> checkpoint,
> > which exceeds the configured threshold 1000000
> > 2013-08-13 17:48:12,413 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage:
> > Saving image file
> >
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
> > using no compression
> > 2013-08-13 17:49:19,085 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage:
> > Image file of size 3385425100 saved in 66 seconds.
> > 2013-08-13 17:49:19,655 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening
> connection
> > to
> >
> http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> > 2013-08-13 17:53:21,107 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> > 241.45s at 0.00 KB/s
> > 2013-08-13 17:53:21,107 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image
> with
> > txid 521186406 to namenode at 10.232.98.77:20021
> >
> >
> > There are below info in Active NameNode:
> > 2013-08-13 17:49:19,659 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening
> connection
> > to
> >
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> > 2013-08-13 17:53:20,610 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> > 240.95s at 13720.96 KB/s
> > 2013-08-13 17:53:20,610 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> > fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > 2013/8/13 Jitendra Yadav <je...@gmail.com>
> >>
> >> Hi,
> >>
> >> Can you please let me know that how you identified the slowness between
> >> primary and standby namnode?
> >>
> >> Also please share the network connection bandwidth between these two
> >> servers.
> >>
> >> Thanks
> >>
> >> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
> >>>
> >>> The fsimage file size is 1658934155
> >>>
> >>>
> >>> 2013/8/13 Harsh J <ha...@cloudera.com>
> >>>>
> >>>> How large are your checkpointed fsimage files?
> >>>>
> >>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> >>>> > When Standby Namenode is doing checkpoint, upload the image file to
> >>>> > Active
> >>>> > NameNode, the Active NameNode is very slow. What is reason result to
> >>>> > the
> >>>> > Active NameNode is slow?
> >>>> >
> >>>> >
> >>>> > Thanks,
> >>>> >
> >>>> > LiuLei
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Harsh J
> >>>
> >>>
> >>
> >
>
>
>
> --
> Harsh J
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
I'm agreed with Harsh comment on image file compression and transfer
bandwidth parameter for optimizing checkpoint process.
In addition I'm not able to correlate your performance program log
timings(less then 10) and file transfer logs timing on active/stand by
nodes.
Thanks
On Tue, Aug 13, 2013 at 4:06 PM, Harsh J <ha...@cloudera.com> wrote:
> Perhaps turning on fsimage compression may help. See
> documentation of dfs.image.compress at
>
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
> .
>
> You can also try to throttle the bandwidth it uses via
> dfs.image.transfer.bandwidthPerSec.
>
> On Tue, Aug 13, 2013 at 3:45 PM, lei liu <li...@gmail.com> wrote:
> > I write one programm to test NameNode performance. Please see the
> > EditLogPerformance.java
> >
> > I use 60 threads to execute the EditLogPerformance.javacode, the testing
> > result is below content:
> > 2013-08-13 17:43:01,479 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
> > 2013-08-13 17:43:11,482 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
> > 2013-08-13 17:43:21,484 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
> > 2013-08-13 17:43:31,487 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
> > 2013-08-13 17:43:41,490 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
> > 2013-08-13 17:43:51,493 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
> > 2013-08-13 17:44:01,496 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
> > 2013-08-13 17:44:11,499 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
> > 2013-08-13 17:44:21,502 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
> > 2013-08-13 17:44:31,505 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
> > 2013-08-13 17:44:41,508 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
> > 2013-08-13 17:44:51,511 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
> > 2013-08-13 17:45:01,514 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
> > 2013-08-13 17:45:11,517 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
> > 2013-08-13 17:45:21,520 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
> > 2013-08-13 17:45:31,523 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
> > 2013-08-13 17:45:41,526 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
> > 2013-08-13 17:45:51,529 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
> > 2013-08-13 17:46:01,532 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
> > 2013-08-13 17:46:11,534 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
> > 2013-08-13 17:46:21,537 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
> > 2013-08-13 17:46:31,540 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
> > 2013-08-13 17:46:41,543 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
> > 2013-08-13 17:46:51,546 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
> > 2013-08-13 17:47:01,549 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
> > 2013-08-13 17:47:11,552 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
> > 2013-08-13 17:47:21,555 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
> > 2013-08-13 17:47:31,558 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
> > 2013-08-13 17:47:41,561 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
> > 2013-08-13 17:47:51,564 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
> > 2013-08-13 17:48:01,567 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
> > 2013-08-13 17:48:11,570 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
> > 2013-08-13 17:48:21,573 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
> > 2013-08-13 17:48:31,576 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
> > 2013-08-13 17:48:41,579 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
> > 2013-08-13 17:48:51,581 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
> > 2013-08-13 17:49:01,584 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
> > 2013-08-13 17:49:11,587 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
> > 2013-08-13 17:49:21,590 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
> > 2013-08-13 17:49:31,593 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
> > 2013-08-13 17:49:41,596 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
> > 2013-08-13 17:49:51,599 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
> > 2013-08-13 17:50:01,602 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
> >
> >
> >
> > The speed is less than ten sometimes. I find when Active NameNode
> download
> > the fsimage file, the speed is less than ten. So I think download
> fsimage
> > file that affects the performance of Active NameNode.
> >
> >
> > There are below info in Standby NameNode:
> > 2013-08-13 17:48:12,412 INFO
> > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> > checkpoint because there have been 2558038 txns since the last
> checkpoint,
> > which exceeds the configured threshold 1000000
> > 2013-08-13 17:48:12,413 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage:
> > Saving image file
> >
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
> > using no compression
> > 2013-08-13 17:49:19,085 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage:
> > Image file of size 3385425100 saved in 66 seconds.
> > 2013-08-13 17:49:19,655 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening
> connection
> > to
> >
> http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> > 2013-08-13 17:53:21,107 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> > 241.45s at 0.00 KB/s
> > 2013-08-13 17:53:21,107 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image
> with
> > txid 521186406 to namenode at 10.232.98.77:20021
> >
> >
> > There are below info in Active NameNode:
> > 2013-08-13 17:49:19,659 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening
> connection
> > to
> >
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> > 2013-08-13 17:53:20,610 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> > 240.95s at 13720.96 KB/s
> > 2013-08-13 17:53:20,610 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> > fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > 2013/8/13 Jitendra Yadav <je...@gmail.com>
> >>
> >> Hi,
> >>
> >> Can you please let me know that how you identified the slowness between
> >> primary and standby namnode?
> >>
> >> Also please share the network connection bandwidth between these two
> >> servers.
> >>
> >> Thanks
> >>
> >> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
> >>>
> >>> The fsimage file size is 1658934155
> >>>
> >>>
> >>> 2013/8/13 Harsh J <ha...@cloudera.com>
> >>>>
> >>>> How large are your checkpointed fsimage files?
> >>>>
> >>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> >>>> > When Standby Namenode is doing checkpoint, upload the image file to
> >>>> > Active
> >>>> > NameNode, the Active NameNode is very slow. What is reason result to
> >>>> > the
> >>>> > Active NameNode is slow?
> >>>> >
> >>>> >
> >>>> > Thanks,
> >>>> >
> >>>> > LiuLei
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Harsh J
> >>>
> >>>
> >>
> >
>
>
>
> --
> Harsh J
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
I'm agreed with Harsh comment on image file compression and transfer
bandwidth parameter for optimizing checkpoint process.
In addition I'm not able to correlate your performance program log
timings(less then 10) and file transfer logs timing on active/stand by
nodes.
Thanks
On Tue, Aug 13, 2013 at 4:06 PM, Harsh J <ha...@cloudera.com> wrote:
> Perhaps turning on fsimage compression may help. See
> documentation of dfs.image.compress at
>
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
> .
>
> You can also try to throttle the bandwidth it uses via
> dfs.image.transfer.bandwidthPerSec.
>
> On Tue, Aug 13, 2013 at 3:45 PM, lei liu <li...@gmail.com> wrote:
> > I write one programm to test NameNode performance. Please see the
> > EditLogPerformance.java
> >
> > I use 60 threads to execute the EditLogPerformance.javacode, the testing
> > result is below content:
> > 2013-08-13 17:43:01,479 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
> > 2013-08-13 17:43:11,482 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
> > 2013-08-13 17:43:21,484 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
> > 2013-08-13 17:43:31,487 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
> > 2013-08-13 17:43:41,490 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
> > 2013-08-13 17:43:51,493 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
> > 2013-08-13 17:44:01,496 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
> > 2013-08-13 17:44:11,499 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
> > 2013-08-13 17:44:21,502 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
> > 2013-08-13 17:44:31,505 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
> > 2013-08-13 17:44:41,508 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
> > 2013-08-13 17:44:51,511 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
> > 2013-08-13 17:45:01,514 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
> > 2013-08-13 17:45:11,517 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
> > 2013-08-13 17:45:21,520 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
> > 2013-08-13 17:45:31,523 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
> > 2013-08-13 17:45:41,526 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
> > 2013-08-13 17:45:51,529 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
> > 2013-08-13 17:46:01,532 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
> > 2013-08-13 17:46:11,534 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
> > 2013-08-13 17:46:21,537 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
> > 2013-08-13 17:46:31,540 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
> > 2013-08-13 17:46:41,543 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
> > 2013-08-13 17:46:51,546 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
> > 2013-08-13 17:47:01,549 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
> > 2013-08-13 17:47:11,552 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
> > 2013-08-13 17:47:21,555 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
> > 2013-08-13 17:47:31,558 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
> > 2013-08-13 17:47:41,561 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
> > 2013-08-13 17:47:51,564 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
> > 2013-08-13 17:48:01,567 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
> > 2013-08-13 17:48:11,570 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
> > 2013-08-13 17:48:21,573 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
> > 2013-08-13 17:48:31,576 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
> > 2013-08-13 17:48:41,579 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
> > 2013-08-13 17:48:51,581 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
> > 2013-08-13 17:49:01,584 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
> > 2013-08-13 17:49:11,587 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
> > 2013-08-13 17:49:21,590 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
> > 2013-08-13 17:49:31,593 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
> > 2013-08-13 17:49:41,596 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
> > 2013-08-13 17:49:51,599 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
> > 2013-08-13 17:50:01,602 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
> >
> >
> >
> > The speed is less than ten sometimes. I find when Active NameNode
> download
> > the fsimage file, the speed is less than ten. So I think download
> fsimage
> > file that affects the performance of Active NameNode.
> >
> >
> > There are below info in Standby NameNode:
> > 2013-08-13 17:48:12,412 INFO
> > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> > checkpoint because there have been 2558038 txns since the last
> checkpoint,
> > which exceeds the configured threshold 1000000
> > 2013-08-13 17:48:12,413 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage:
> > Saving image file
> >
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
> > using no compression
> > 2013-08-13 17:49:19,085 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage:
> > Image file of size 3385425100 saved in 66 seconds.
> > 2013-08-13 17:49:19,655 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening
> connection
> > to
> >
> http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> > 2013-08-13 17:53:21,107 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> > 241.45s at 0.00 KB/s
> > 2013-08-13 17:53:21,107 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image
> with
> > txid 521186406 to namenode at 10.232.98.77:20021
> >
> >
> > There are below info in Active NameNode:
> > 2013-08-13 17:49:19,659 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening
> connection
> > to
> >
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> > 2013-08-13 17:53:20,610 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> > 240.95s at 13720.96 KB/s
> > 2013-08-13 17:53:20,610 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> > fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > 2013/8/13 Jitendra Yadav <je...@gmail.com>
> >>
> >> Hi,
> >>
> >> Can you please let me know that how you identified the slowness between
> >> primary and standby namnode?
> >>
> >> Also please share the network connection bandwidth between these two
> >> servers.
> >>
> >> Thanks
> >>
> >> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
> >>>
> >>> The fsimage file size is 1658934155
> >>>
> >>>
> >>> 2013/8/13 Harsh J <ha...@cloudera.com>
> >>>>
> >>>> How large are your checkpointed fsimage files?
> >>>>
> >>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> >>>> > When Standby Namenode is doing checkpoint, upload the image file to
> >>>> > Active
> >>>> > NameNode, the Active NameNode is very slow. What is reason result to
> >>>> > the
> >>>> > Active NameNode is slow?
> >>>> >
> >>>> >
> >>>> > Thanks,
> >>>> >
> >>>> > LiuLei
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Harsh J
> >>>
> >>>
> >>
> >
>
>
>
> --
> Harsh J
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
I'm agreed with Harsh comment on image file compression and transfer
bandwidth parameter for optimizing checkpoint process.
In addition I'm not able to correlate your performance program log
timings(less then 10) and file transfer logs timing on active/stand by
nodes.
Thanks
On Tue, Aug 13, 2013 at 4:06 PM, Harsh J <ha...@cloudera.com> wrote:
> Perhaps turning on fsimage compression may help. See
> documentation of dfs.image.compress at
>
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
> .
>
> You can also try to throttle the bandwidth it uses via
> dfs.image.transfer.bandwidthPerSec.
>
> On Tue, Aug 13, 2013 at 3:45 PM, lei liu <li...@gmail.com> wrote:
> > I write one programm to test NameNode performance. Please see the
> > EditLogPerformance.java
> >
> > I use 60 threads to execute the EditLogPerformance.javacode, the testing
> > result is below content:
> > 2013-08-13 17:43:01,479 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
> > 2013-08-13 17:43:11,482 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
> > 2013-08-13 17:43:21,484 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
> > 2013-08-13 17:43:31,487 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
> > 2013-08-13 17:43:41,490 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
> > 2013-08-13 17:43:51,493 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
> > 2013-08-13 17:44:01,496 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
> > 2013-08-13 17:44:11,499 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
> > 2013-08-13 17:44:21,502 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
> > 2013-08-13 17:44:31,505 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
> > 2013-08-13 17:44:41,508 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
> > 2013-08-13 17:44:51,511 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
> > 2013-08-13 17:45:01,514 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
> > 2013-08-13 17:45:11,517 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
> > 2013-08-13 17:45:21,520 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
> > 2013-08-13 17:45:31,523 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
> > 2013-08-13 17:45:41,526 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
> > 2013-08-13 17:45:51,529 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
> > 2013-08-13 17:46:01,532 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
> > 2013-08-13 17:46:11,534 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
> > 2013-08-13 17:46:21,537 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
> > 2013-08-13 17:46:31,540 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
> > 2013-08-13 17:46:41,543 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
> > 2013-08-13 17:46:51,546 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
> > 2013-08-13 17:47:01,549 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
> > 2013-08-13 17:47:11,552 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
> > 2013-08-13 17:47:21,555 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
> > 2013-08-13 17:47:31,558 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
> > 2013-08-13 17:47:41,561 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
> > 2013-08-13 17:47:51,564 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
> > 2013-08-13 17:48:01,567 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
> > 2013-08-13 17:48:11,570 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
> > 2013-08-13 17:48:21,573 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
> > 2013-08-13 17:48:31,576 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
> > 2013-08-13 17:48:41,579 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
> > 2013-08-13 17:48:51,581 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
> > 2013-08-13 17:49:01,584 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
> > 2013-08-13 17:49:11,587 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
> > 2013-08-13 17:49:21,590 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
> > 2013-08-13 17:49:31,593 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
> > 2013-08-13 17:49:41,596 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
> > 2013-08-13 17:49:51,599 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
> > 2013-08-13 17:50:01,602 INFO my.EditLogPerformance
> > (EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
> >
> >
> >
> > The speed is less than ten sometimes. I find when Active NameNode
> download
> > the fsimage file, the speed is less than ten. So I think download
> fsimage
> > file that affects the performance of Active NameNode.
> >
> >
> > There are below info in Standby NameNode:
> > 2013-08-13 17:48:12,412 INFO
> > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> > checkpoint because there have been 2558038 txns since the last
> checkpoint,
> > which exceeds the configured threshold 1000000
> > 2013-08-13 17:48:12,413 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage:
> > Saving image file
> >
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
> > using no compression
> > 2013-08-13 17:49:19,085 INFO
> org.apache.hadoop.hdfs.server.namenode.FSImage:
> > Image file of size 3385425100 saved in 66 seconds.
> > 2013-08-13 17:49:19,655 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening
> connection
> > to
> >
> http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> > 2013-08-13 17:53:21,107 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> > 241.45s at 0.00 KB/s
> > 2013-08-13 17:53:21,107 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image
> with
> > txid 521186406 to namenode at 10.232.98.77:20021
> >
> >
> > There are below info in Active NameNode:
> > 2013-08-13 17:49:19,659 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening
> connection
> > to
> >
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> > 2013-08-13 17:53:20,610 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> > 240.95s at 13720.96 KB/s
> > 2013-08-13 17:53:20,610 INFO
> > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> > fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > 2013/8/13 Jitendra Yadav <je...@gmail.com>
> >>
> >> Hi,
> >>
> >> Can you please let me know that how you identified the slowness between
> >> primary and standby namnode?
> >>
> >> Also please share the network connection bandwidth between these two
> >> servers.
> >>
> >> Thanks
> >>
> >> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
> >>>
> >>> The fsimage file size is 1658934155
> >>>
> >>>
> >>> 2013/8/13 Harsh J <ha...@cloudera.com>
> >>>>
> >>>> How large are your checkpointed fsimage files?
> >>>>
> >>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> >>>> > When Standby Namenode is doing checkpoint, upload the image file to
> >>>> > Active
> >>>> > NameNode, the Active NameNode is very slow. What is reason result to
> >>>> > the
> >>>> > Active NameNode is slow?
> >>>> >
> >>>> >
> >>>> > Thanks,
> >>>> >
> >>>> > LiuLei
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Harsh J
> >>>
> >>>
> >>
> >
>
>
>
> --
> Harsh J
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Harsh J <ha...@cloudera.com>.
Perhaps turning on fsimage compression may help. See
documentation of dfs.image.compress at
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml.
You can also try to throttle the bandwidth it uses via
dfs.image.transfer.bandwidthPerSec.
On Tue, Aug 13, 2013 at 3:45 PM, lei liu <li...@gmail.com> wrote:
> I write one programm to test NameNode performance. Please see the
> EditLogPerformance.java
>
> I use 60 threads to execute the EditLogPerformance.javacode, the testing
> result is below content:
> 2013-08-13 17:43:01,479 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
> 2013-08-13 17:43:11,482 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
> 2013-08-13 17:43:21,484 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
> 2013-08-13 17:43:31,487 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
> 2013-08-13 17:43:41,490 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
> 2013-08-13 17:43:51,493 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
> 2013-08-13 17:44:01,496 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
> 2013-08-13 17:44:11,499 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
> 2013-08-13 17:44:21,502 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
> 2013-08-13 17:44:31,505 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
> 2013-08-13 17:44:41,508 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
> 2013-08-13 17:44:51,511 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
> 2013-08-13 17:45:01,514 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
> 2013-08-13 17:45:11,517 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
> 2013-08-13 17:45:21,520 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
> 2013-08-13 17:45:31,523 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
> 2013-08-13 17:45:41,526 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
> 2013-08-13 17:45:51,529 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
> 2013-08-13 17:46:01,532 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
> 2013-08-13 17:46:11,534 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
> 2013-08-13 17:46:21,537 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
> 2013-08-13 17:46:31,540 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
> 2013-08-13 17:46:41,543 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
> 2013-08-13 17:46:51,546 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
> 2013-08-13 17:47:01,549 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
> 2013-08-13 17:47:11,552 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
> 2013-08-13 17:47:21,555 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
> 2013-08-13 17:47:31,558 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
> 2013-08-13 17:47:41,561 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
> 2013-08-13 17:47:51,564 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
> 2013-08-13 17:48:01,567 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
> 2013-08-13 17:48:11,570 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
> 2013-08-13 17:48:21,573 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
> 2013-08-13 17:48:31,576 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
> 2013-08-13 17:48:41,579 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
> 2013-08-13 17:48:51,581 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
> 2013-08-13 17:49:01,584 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
> 2013-08-13 17:49:11,587 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
> 2013-08-13 17:49:21,590 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
> 2013-08-13 17:49:31,593 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
> 2013-08-13 17:49:41,596 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
> 2013-08-13 17:49:51,599 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
> 2013-08-13 17:50:01,602 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
>
>
>
> The speed is less than ten sometimes. I find when Active NameNode download
> the fsimage file, the speed is less than ten. So I think download fsimage
> file that affects the performance of Active NameNode.
>
>
> There are below info in Standby NameNode:
> 2013-08-13 17:48:12,412 INFO
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> checkpoint because there have been 2558038 txns since the last checkpoint,
> which exceeds the configured threshold 1000000
> 2013-08-13 17:48:12,413 INFO org.apache.hadoop.hdfs.server.namenode.FSImage:
> Saving image file
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
> using no compression
> 2013-08-13 17:49:19,085 INFO org.apache.hadoop.hdfs.server.namenode.FSImage:
> Image file of size 3385425100 saved in 66 seconds.
> 2013-08-13 17:49:19,655 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-13 17:53:21,107 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 241.45s at 0.00 KB/s
> 2013-08-13 17:53:21,107 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
> txid 521186406 to namenode at 10.232.98.77:20021
>
>
> There are below info in Active NameNode:
> 2013-08-13 17:49:19,659 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-13 17:53:20,610 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 240.95s at 13720.96 KB/s
> 2013-08-13 17:53:20,610 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
>
>
>
>
>
>
>
>
>
>
>
>
> 2013/8/13 Jitendra Yadav <je...@gmail.com>
>>
>> Hi,
>>
>> Can you please let me know that how you identified the slowness between
>> primary and standby namnode?
>>
>> Also please share the network connection bandwidth between these two
>> servers.
>>
>> Thanks
>>
>> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
>>>
>>> The fsimage file size is 1658934155
>>>
>>>
>>> 2013/8/13 Harsh J <ha...@cloudera.com>
>>>>
>>>> How large are your checkpointed fsimage files?
>>>>
>>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>>>> > When Standby Namenode is doing checkpoint, upload the image file to
>>>> > Active
>>>> > NameNode, the Active NameNode is very slow. What is reason result to
>>>> > the
>>>> > Active NameNode is slow?
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > LiuLei
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>
>>>
>>
>
--
Harsh J
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Harsh J <ha...@cloudera.com>.
Perhaps turning on fsimage compression may help. See
documentation of dfs.image.compress at
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml.
You can also try to throttle the bandwidth it uses via
dfs.image.transfer.bandwidthPerSec.
On Tue, Aug 13, 2013 at 3:45 PM, lei liu <li...@gmail.com> wrote:
> I write one programm to test NameNode performance. Please see the
> EditLogPerformance.java
>
> I use 60 threads to execute the EditLogPerformance.javacode, the testing
> result is below content:
> 2013-08-13 17:43:01,479 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
> 2013-08-13 17:43:11,482 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
> 2013-08-13 17:43:21,484 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
> 2013-08-13 17:43:31,487 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
> 2013-08-13 17:43:41,490 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
> 2013-08-13 17:43:51,493 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
> 2013-08-13 17:44:01,496 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
> 2013-08-13 17:44:11,499 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
> 2013-08-13 17:44:21,502 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
> 2013-08-13 17:44:31,505 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
> 2013-08-13 17:44:41,508 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
> 2013-08-13 17:44:51,511 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
> 2013-08-13 17:45:01,514 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
> 2013-08-13 17:45:11,517 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
> 2013-08-13 17:45:21,520 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
> 2013-08-13 17:45:31,523 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
> 2013-08-13 17:45:41,526 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
> 2013-08-13 17:45:51,529 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
> 2013-08-13 17:46:01,532 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
> 2013-08-13 17:46:11,534 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
> 2013-08-13 17:46:21,537 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
> 2013-08-13 17:46:31,540 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
> 2013-08-13 17:46:41,543 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
> 2013-08-13 17:46:51,546 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
> 2013-08-13 17:47:01,549 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
> 2013-08-13 17:47:11,552 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
> 2013-08-13 17:47:21,555 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
> 2013-08-13 17:47:31,558 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
> 2013-08-13 17:47:41,561 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
> 2013-08-13 17:47:51,564 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
> 2013-08-13 17:48:01,567 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
> 2013-08-13 17:48:11,570 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
> 2013-08-13 17:48:21,573 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
> 2013-08-13 17:48:31,576 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
> 2013-08-13 17:48:41,579 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
> 2013-08-13 17:48:51,581 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
> 2013-08-13 17:49:01,584 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
> 2013-08-13 17:49:11,587 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
> 2013-08-13 17:49:21,590 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
> 2013-08-13 17:49:31,593 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
> 2013-08-13 17:49:41,596 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
> 2013-08-13 17:49:51,599 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
> 2013-08-13 17:50:01,602 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
>
>
>
> The speed is less than ten sometimes. I find when Active NameNode download
> the fsimage file, the speed is less than ten. So I think download fsimage
> file that affects the performance of Active NameNode.
>
>
> There are below info in Standby NameNode:
> 2013-08-13 17:48:12,412 INFO
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> checkpoint because there have been 2558038 txns since the last checkpoint,
> which exceeds the configured threshold 1000000
> 2013-08-13 17:48:12,413 INFO org.apache.hadoop.hdfs.server.namenode.FSImage:
> Saving image file
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
> using no compression
> 2013-08-13 17:49:19,085 INFO org.apache.hadoop.hdfs.server.namenode.FSImage:
> Image file of size 3385425100 saved in 66 seconds.
> 2013-08-13 17:49:19,655 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-13 17:53:21,107 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 241.45s at 0.00 KB/s
> 2013-08-13 17:53:21,107 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
> txid 521186406 to namenode at 10.232.98.77:20021
>
>
> There are below info in Active NameNode:
> 2013-08-13 17:49:19,659 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-13 17:53:20,610 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 240.95s at 13720.96 KB/s
> 2013-08-13 17:53:20,610 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
>
>
>
>
>
>
>
>
>
>
>
>
> 2013/8/13 Jitendra Yadav <je...@gmail.com>
>>
>> Hi,
>>
>> Can you please let me know that how you identified the slowness between
>> primary and standby namnode?
>>
>> Also please share the network connection bandwidth between these two
>> servers.
>>
>> Thanks
>>
>> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
>>>
>>> The fsimage file size is 1658934155
>>>
>>>
>>> 2013/8/13 Harsh J <ha...@cloudera.com>
>>>>
>>>> How large are your checkpointed fsimage files?
>>>>
>>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>>>> > When Standby Namenode is doing checkpoint, upload the image file to
>>>> > Active
>>>> > NameNode, the Active NameNode is very slow. What is reason result to
>>>> > the
>>>> > Active NameNode is slow?
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > LiuLei
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>
>>>
>>
>
--
Harsh J
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Harsh J <ha...@cloudera.com>.
Perhaps turning on fsimage compression may help. See
documentation of dfs.image.compress at
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml.
You can also try to throttle the bandwidth it uses via
dfs.image.transfer.bandwidthPerSec.
On Tue, Aug 13, 2013 at 3:45 PM, lei liu <li...@gmail.com> wrote:
> I write one programm to test NameNode performance. Please see the
> EditLogPerformance.java
>
> I use 60 threads to execute the EditLogPerformance.javacode, the testing
> result is below content:
> 2013-08-13 17:43:01,479 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
> 2013-08-13 17:43:11,482 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
> 2013-08-13 17:43:21,484 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
> 2013-08-13 17:43:31,487 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
> 2013-08-13 17:43:41,490 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
> 2013-08-13 17:43:51,493 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
> 2013-08-13 17:44:01,496 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
> 2013-08-13 17:44:11,499 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
> 2013-08-13 17:44:21,502 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
> 2013-08-13 17:44:31,505 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
> 2013-08-13 17:44:41,508 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
> 2013-08-13 17:44:51,511 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
> 2013-08-13 17:45:01,514 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
> 2013-08-13 17:45:11,517 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
> 2013-08-13 17:45:21,520 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
> 2013-08-13 17:45:31,523 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
> 2013-08-13 17:45:41,526 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
> 2013-08-13 17:45:51,529 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
> 2013-08-13 17:46:01,532 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
> 2013-08-13 17:46:11,534 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
> 2013-08-13 17:46:21,537 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
> 2013-08-13 17:46:31,540 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
> 2013-08-13 17:46:41,543 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
> 2013-08-13 17:46:51,546 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
> 2013-08-13 17:47:01,549 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
> 2013-08-13 17:47:11,552 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
> 2013-08-13 17:47:21,555 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
> 2013-08-13 17:47:31,558 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
> 2013-08-13 17:47:41,561 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
> 2013-08-13 17:47:51,564 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
> 2013-08-13 17:48:01,567 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
> 2013-08-13 17:48:11,570 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
> 2013-08-13 17:48:21,573 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
> 2013-08-13 17:48:31,576 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
> 2013-08-13 17:48:41,579 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
> 2013-08-13 17:48:51,581 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
> 2013-08-13 17:49:01,584 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
> 2013-08-13 17:49:11,587 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
> 2013-08-13 17:49:21,590 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
> 2013-08-13 17:49:31,593 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
> 2013-08-13 17:49:41,596 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
> 2013-08-13 17:49:51,599 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
> 2013-08-13 17:50:01,602 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
>
>
>
> The speed is less than ten sometimes. I find when Active NameNode download
> the fsimage file, the speed is less than ten. So I think download fsimage
> file that affects the performance of Active NameNode.
>
>
> There are below info in Standby NameNode:
> 2013-08-13 17:48:12,412 INFO
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> checkpoint because there have been 2558038 txns since the last checkpoint,
> which exceeds the configured threshold 1000000
> 2013-08-13 17:48:12,413 INFO org.apache.hadoop.hdfs.server.namenode.FSImage:
> Saving image file
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
> using no compression
> 2013-08-13 17:49:19,085 INFO org.apache.hadoop.hdfs.server.namenode.FSImage:
> Image file of size 3385425100 saved in 66 seconds.
> 2013-08-13 17:49:19,655 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-13 17:53:21,107 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 241.45s at 0.00 KB/s
> 2013-08-13 17:53:21,107 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
> txid 521186406 to namenode at 10.232.98.77:20021
>
>
> There are below info in Active NameNode:
> 2013-08-13 17:49:19,659 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-13 17:53:20,610 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 240.95s at 13720.96 KB/s
> 2013-08-13 17:53:20,610 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
>
>
>
>
>
>
>
>
>
>
>
>
> 2013/8/13 Jitendra Yadav <je...@gmail.com>
>>
>> Hi,
>>
>> Can you please let me know that how you identified the slowness between
>> primary and standby namnode?
>>
>> Also please share the network connection bandwidth between these two
>> servers.
>>
>> Thanks
>>
>> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
>>>
>>> The fsimage file size is 1658934155
>>>
>>>
>>> 2013/8/13 Harsh J <ha...@cloudera.com>
>>>>
>>>> How large are your checkpointed fsimage files?
>>>>
>>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>>>> > When Standby Namenode is doing checkpoint, upload the image file to
>>>> > Active
>>>> > NameNode, the Active NameNode is very slow. What is reason result to
>>>> > the
>>>> > Active NameNode is slow?
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > LiuLei
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>
>>>
>>
>
--
Harsh J
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Harsh J <ha...@cloudera.com>.
Perhaps turning on fsimage compression may help. See
documentation of dfs.image.compress at
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml.
You can also try to throttle the bandwidth it uses via
dfs.image.transfer.bandwidthPerSec.
On Tue, Aug 13, 2013 at 3:45 PM, lei liu <li...@gmail.com> wrote:
> I write one programm to test NameNode performance. Please see the
> EditLogPerformance.java
>
> I use 60 threads to execute the EditLogPerformance.javacode, the testing
> result is below content:
> 2013-08-13 17:43:01,479 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
> 2013-08-13 17:43:11,482 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
> 2013-08-13 17:43:21,484 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
> 2013-08-13 17:43:31,487 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
> 2013-08-13 17:43:41,490 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
> 2013-08-13 17:43:51,493 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
> 2013-08-13 17:44:01,496 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
> 2013-08-13 17:44:11,499 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
> 2013-08-13 17:44:21,502 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
> 2013-08-13 17:44:31,505 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
> 2013-08-13 17:44:41,508 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
> 2013-08-13 17:44:51,511 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
> 2013-08-13 17:45:01,514 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
> 2013-08-13 17:45:11,517 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
> 2013-08-13 17:45:21,520 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
> 2013-08-13 17:45:31,523 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
> 2013-08-13 17:45:41,526 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
> 2013-08-13 17:45:51,529 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
> 2013-08-13 17:46:01,532 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
> 2013-08-13 17:46:11,534 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
> 2013-08-13 17:46:21,537 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
> 2013-08-13 17:46:31,540 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
> 2013-08-13 17:46:41,543 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
> 2013-08-13 17:46:51,546 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
> 2013-08-13 17:47:01,549 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
> 2013-08-13 17:47:11,552 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
> 2013-08-13 17:47:21,555 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
> 2013-08-13 17:47:31,558 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
> 2013-08-13 17:47:41,561 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
> 2013-08-13 17:47:51,564 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
> 2013-08-13 17:48:01,567 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
> 2013-08-13 17:48:11,570 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
> 2013-08-13 17:48:21,573 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
> 2013-08-13 17:48:31,576 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
> 2013-08-13 17:48:41,579 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
> 2013-08-13 17:48:51,581 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
> 2013-08-13 17:49:01,584 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
> 2013-08-13 17:49:11,587 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
> 2013-08-13 17:49:21,590 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
> 2013-08-13 17:49:31,593 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
> 2013-08-13 17:49:41,596 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
> 2013-08-13 17:49:51,599 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
> 2013-08-13 17:50:01,602 INFO my.EditLogPerformance
> (EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
>
>
>
> The speed is less than ten sometimes. I find when Active NameNode download
> the fsimage file, the speed is less than ten. So I think download fsimage
> file that affects the performance of Active NameNode.
>
>
> There are below info in Standby NameNode:
> 2013-08-13 17:48:12,412 INFO
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
> checkpoint because there have been 2558038 txns since the last checkpoint,
> which exceeds the configured threshold 1000000
> 2013-08-13 17:48:12,413 INFO org.apache.hadoop.hdfs.server.namenode.FSImage:
> Saving image file
> /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
> using no compression
> 2013-08-13 17:49:19,085 INFO org.apache.hadoop.hdfs.server.namenode.FSImage:
> Image file of size 3385425100 saved in 66 seconds.
> 2013-08-13 17:49:19,655 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-13 17:53:21,107 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 241.45s at 0.00 KB/s
> 2013-08-13 17:53:21,107 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
> txid 521186406 to namenode at 10.232.98.77:20021
>
>
> There are below info in Active NameNode:
> 2013-08-13 17:49:19,659 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
> to
> http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
> 2013-08-13 17:53:20,610 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
> 240.95s at 13720.96 KB/s
> 2013-08-13 17:53:20,610 INFO
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
> fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
>
>
>
>
>
>
>
>
>
>
>
>
> 2013/8/13 Jitendra Yadav <je...@gmail.com>
>>
>> Hi,
>>
>> Can you please let me know that how you identified the slowness between
>> primary and standby namnode?
>>
>> Also please share the network connection bandwidth between these two
>> servers.
>>
>> Thanks
>>
>> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
>>>
>>> The fsimage file size is 1658934155
>>>
>>>
>>> 2013/8/13 Harsh J <ha...@cloudera.com>
>>>>
>>>> How large are your checkpointed fsimage files?
>>>>
>>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>>>> > When Standby Namenode is doing checkpoint, upload the image file to
>>>> > Active
>>>> > NameNode, the Active NameNode is very slow. What is reason result to
>>>> > the
>>>> > Active NameNode is slow?
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > LiuLei
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>
>>>
>>
>
--
Harsh J
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
I write one programm to test NameNode performance. Please see the
EditLogPerformance.java
I use 60 threads to execute the EditLogPerformance.javacode, the testing
result is below content:
2013-08-13 17:43:01,479 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
2013-08-13 17:43:11,482 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
2013-08-13 17:43:21,484 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
2013-08-13 17:43:31,487 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
2013-08-13 17:43:41,490 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
2013-08-13 17:43:51,493 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
2013-08-13 17:44:01,496 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
2013-08-13 17:44:11,499 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
2013-08-13 17:44:21,502 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
2013-08-13 17:44:31,505 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
2013-08-13 17:44:41,508 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
2013-08-13 17:44:51,511 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
2013-08-13 17:45:01,514 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
2013-08-13 17:45:11,517 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
2013-08-13 17:45:21,520 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
2013-08-13 17:45:31,523 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
2013-08-13 17:45:41,526 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
2013-08-13 17:45:51,529 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
2013-08-13 17:46:01,532 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
2013-08-13 17:46:11,534 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
2013-08-13 17:46:21,537 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
2013-08-13 17:46:31,540 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
2013-08-13 17:46:41,543 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
2013-08-13 17:46:51,546 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
2013-08-13 17:47:01,549 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
2013-08-13 17:47:11,552 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
2013-08-13 17:47:21,555 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
2013-08-13 17:47:31,558 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
2013-08-13 17:47:41,561 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
2013-08-13 17:47:51,564 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
2013-08-13 17:48:01,567 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
2013-08-13 17:48:11,570 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
2013-08-13 17:48:21,573 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
2013-08-13 17:48:31,576 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
2013-08-13 17:48:41,579 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
2013-08-13 17:48:51,581 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
2013-08-13 17:49:01,584 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
2013-08-13 17:49:11,587 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
2013-08-13 17:49:21,590 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
2013-08-13 17:49:31,593 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
2013-08-13 17:49:41,596 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
2013-08-13 17:49:51,599 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
2013-08-13 17:50:01,602 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
The speed is less than ten sometimes. I find when Active NameNode download
the fsimage file, the speed is less than ten. So I think download fsimage
file that affects the performance of Active NameNode.
There are below info in Standby NameNode:
2013-08-13 17:48:12,412 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 2558038 txns since the last checkpoint,
which exceeds the configured threshold 1000000
2013-08-13 17:48:12,413 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
using no compression
2013-08-13 17:49:19,085 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size
3385425100 saved in 66 seconds.
2013-08-13 17:49:19,655 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
241.45s at 0.00 KB/s
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 521186406 to namenode at 10.232.98.77:20021
There are below info in Active NameNode:
2013-08-13 17:49:19,659 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
240.95s at 13720.96 KB/s
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
2013/8/13 Jitendra Yadav <je...@gmail.com>
> Hi,
>
> Can you please let me know that how you identified the slowness between
> primary and standby namnode?
>
> Also please share the network connection bandwidth between these two
> servers.
>
> Thanks
>
> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
>
>> The fsimage file size is 1658934155
>>
>>
>> 2013/8/13 Harsh J <ha...@cloudera.com>
>>
>>> How large are your checkpointed fsimage files?
>>>
>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>>> > When Standby Namenode is doing checkpoint, upload the image file to
>>> Active
>>> > NameNode, the Active NameNode is very slow. What is reason result to
>>> the
>>> > Active NameNode is slow?
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > LiuLei
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
I write one programm to test NameNode performance. Please see the
EditLogPerformance.java
I use 60 threads to execute the EditLogPerformance.javacode, the testing
result is below content:
2013-08-13 17:43:01,479 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
2013-08-13 17:43:11,482 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
2013-08-13 17:43:21,484 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
2013-08-13 17:43:31,487 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
2013-08-13 17:43:41,490 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
2013-08-13 17:43:51,493 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
2013-08-13 17:44:01,496 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
2013-08-13 17:44:11,499 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
2013-08-13 17:44:21,502 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
2013-08-13 17:44:31,505 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
2013-08-13 17:44:41,508 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
2013-08-13 17:44:51,511 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
2013-08-13 17:45:01,514 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
2013-08-13 17:45:11,517 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
2013-08-13 17:45:21,520 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
2013-08-13 17:45:31,523 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
2013-08-13 17:45:41,526 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
2013-08-13 17:45:51,529 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
2013-08-13 17:46:01,532 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
2013-08-13 17:46:11,534 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
2013-08-13 17:46:21,537 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
2013-08-13 17:46:31,540 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
2013-08-13 17:46:41,543 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
2013-08-13 17:46:51,546 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
2013-08-13 17:47:01,549 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
2013-08-13 17:47:11,552 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
2013-08-13 17:47:21,555 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
2013-08-13 17:47:31,558 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
2013-08-13 17:47:41,561 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
2013-08-13 17:47:51,564 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
2013-08-13 17:48:01,567 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
2013-08-13 17:48:11,570 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
2013-08-13 17:48:21,573 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
2013-08-13 17:48:31,576 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
2013-08-13 17:48:41,579 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
2013-08-13 17:48:51,581 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
2013-08-13 17:49:01,584 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
2013-08-13 17:49:11,587 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
2013-08-13 17:49:21,590 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
2013-08-13 17:49:31,593 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
2013-08-13 17:49:41,596 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
2013-08-13 17:49:51,599 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
2013-08-13 17:50:01,602 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
The speed is less than ten sometimes. I find when Active NameNode download
the fsimage file, the speed is less than ten. So I think download fsimage
file that affects the performance of Active NameNode.
There are below info in Standby NameNode:
2013-08-13 17:48:12,412 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 2558038 txns since the last checkpoint,
which exceeds the configured threshold 1000000
2013-08-13 17:48:12,413 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
using no compression
2013-08-13 17:49:19,085 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size
3385425100 saved in 66 seconds.
2013-08-13 17:49:19,655 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
241.45s at 0.00 KB/s
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 521186406 to namenode at 10.232.98.77:20021
There are below info in Active NameNode:
2013-08-13 17:49:19,659 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
240.95s at 13720.96 KB/s
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
2013/8/13 Jitendra Yadav <je...@gmail.com>
> Hi,
>
> Can you please let me know that how you identified the slowness between
> primary and standby namnode?
>
> Also please share the network connection bandwidth between these two
> servers.
>
> Thanks
>
> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
>
>> The fsimage file size is 1658934155
>>
>>
>> 2013/8/13 Harsh J <ha...@cloudera.com>
>>
>>> How large are your checkpointed fsimage files?
>>>
>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>>> > When Standby Namenode is doing checkpoint, upload the image file to
>>> Active
>>> > NameNode, the Active NameNode is very slow. What is reason result to
>>> the
>>> > Active NameNode is slow?
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > LiuLei
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
I write one programm to test NameNode performance. Please see the
EditLogPerformance.java
I use 60 threads to execute the EditLogPerformance.javacode, the testing
result is below content:
2013-08-13 17:43:01,479 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
2013-08-13 17:43:11,482 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
2013-08-13 17:43:21,484 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
2013-08-13 17:43:31,487 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
2013-08-13 17:43:41,490 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
2013-08-13 17:43:51,493 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
2013-08-13 17:44:01,496 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
2013-08-13 17:44:11,499 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
2013-08-13 17:44:21,502 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
2013-08-13 17:44:31,505 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
2013-08-13 17:44:41,508 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
2013-08-13 17:44:51,511 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
2013-08-13 17:45:01,514 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
2013-08-13 17:45:11,517 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
2013-08-13 17:45:21,520 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
2013-08-13 17:45:31,523 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
2013-08-13 17:45:41,526 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
2013-08-13 17:45:51,529 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
2013-08-13 17:46:01,532 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
2013-08-13 17:46:11,534 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
2013-08-13 17:46:21,537 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
2013-08-13 17:46:31,540 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
2013-08-13 17:46:41,543 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
2013-08-13 17:46:51,546 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
2013-08-13 17:47:01,549 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
2013-08-13 17:47:11,552 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
2013-08-13 17:47:21,555 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
2013-08-13 17:47:31,558 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
2013-08-13 17:47:41,561 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
2013-08-13 17:47:51,564 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
2013-08-13 17:48:01,567 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
2013-08-13 17:48:11,570 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
2013-08-13 17:48:21,573 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
2013-08-13 17:48:31,576 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
2013-08-13 17:48:41,579 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
2013-08-13 17:48:51,581 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
2013-08-13 17:49:01,584 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
2013-08-13 17:49:11,587 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
2013-08-13 17:49:21,590 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
2013-08-13 17:49:31,593 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
2013-08-13 17:49:41,596 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
2013-08-13 17:49:51,599 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
2013-08-13 17:50:01,602 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
The speed is less than ten sometimes. I find when Active NameNode download
the fsimage file, the speed is less than ten. So I think download fsimage
file that affects the performance of Active NameNode.
There are below info in Standby NameNode:
2013-08-13 17:48:12,412 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 2558038 txns since the last checkpoint,
which exceeds the configured threshold 1000000
2013-08-13 17:48:12,413 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
using no compression
2013-08-13 17:49:19,085 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size
3385425100 saved in 66 seconds.
2013-08-13 17:49:19,655 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
241.45s at 0.00 KB/s
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 521186406 to namenode at 10.232.98.77:20021
There are below info in Active NameNode:
2013-08-13 17:49:19,659 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
240.95s at 13720.96 KB/s
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
2013/8/13 Jitendra Yadav <je...@gmail.com>
> Hi,
>
> Can you please let me know that how you identified the slowness between
> primary and standby namnode?
>
> Also please share the network connection bandwidth between these two
> servers.
>
> Thanks
>
> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
>
>> The fsimage file size is 1658934155
>>
>>
>> 2013/8/13 Harsh J <ha...@cloudera.com>
>>
>>> How large are your checkpointed fsimage files?
>>>
>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>>> > When Standby Namenode is doing checkpoint, upload the image file to
>>> Active
>>> > NameNode, the Active NameNode is very slow. What is reason result to
>>> the
>>> > Active NameNode is slow?
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > LiuLei
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
I write one programm to test NameNode performance. Please see the
EditLogPerformance.java
I use 60 threads to execute the EditLogPerformance.javacode, the testing
result is below content:
2013-08-13 17:43:01,479 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055
2013-08-13 17:43:11,482 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725
2013-08-13 17:43:21,484 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2
2013-08-13 17:43:31,487 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6
2013-08-13 17:43:41,490 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6
2013-08-13 17:43:51,493 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53
2013-08-13 17:44:01,496 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676
2013-08-13 17:44:11,499 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149
2013-08-13 17:44:21,502 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997
2013-08-13 17:44:31,505 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072
2013-08-13 17:44:41,508 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008
2013-08-13 17:44:51,511 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994
2013-08-13 17:45:01,514 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071
2013-08-13 17:45:11,517 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090
2013-08-13 17:45:21,520 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170
2013-08-13 17:45:31,523 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059
2013-08-13 17:45:41,526 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076
2013-08-13 17:45:51,529 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021
2013-08-13 17:46:01,532 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067
2013-08-13 17:46:11,534 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066
2013-08-13 17:46:21,537 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093
2013-08-13 17:46:31,540 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023
2013-08-13 17:46:41,543 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971
2013-08-13 17:46:51,546 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041
2013-08-13 17:47:01,549 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031
2013-08-13 17:47:11,552 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079
2013-08-13 17:47:21,555 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069
2013-08-13 17:47:31,558 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009
2013-08-13 17:47:41,561 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978
2013-08-13 17:47:51,564 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028
2013-08-13 17:48:01,567 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062
2013-08-13 17:48:11,570 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979
2013-08-13 17:48:21,573 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005
2013-08-13 17:48:31,576 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012
2013-08-13 17:48:41,579 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034
2013-08-13 17:48:51,581 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045
2013-08-13 17:49:01,584 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045
2013-08-13 17:49:11,587 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054
2013-08-13 17:49:21,590 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850
2013-08-13 17:49:31,593 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0
2013-08-13 17:49:41,596 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6
2013-08-13 17:49:51,599 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8
2013-08-13 17:50:01,602 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194
The speed is less than ten sometimes. I find when Active NameNode download
the fsimage file, the speed is less than ten. So I think download fsimage
file that affects the performance of Active NameNode.
There are below info in Standby NameNode:
2013-08-13 17:48:12,412 INFO
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering
checkpoint because there have been 2558038 txns since the last checkpoint,
which exceeds the configured threshold 1000000
2013-08-13 17:48:12,413 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file
/home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406
using no compression
2013-08-13 17:49:19,085 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size
3385425100 saved in 66 seconds.
2013-08-13 17:49:19,655 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
241.45s at 0.00 KB/s
2013-08-13 17:53:21,107 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with
txid 521186406 to namenode at 10.232.98.77:20021
There are below info in Active NameNode:
2013-08-13 17:49:19,659 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection
to
http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took
240.95s at 13720.96 KB/s
2013-08-13 17:53:20,610 INFO
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file
fsimage.ckpt_0000000000521186406 size 3385425100 bytes.
2013/8/13 Jitendra Yadav <je...@gmail.com>
> Hi,
>
> Can you please let me know that how you identified the slowness between
> primary and standby namnode?
>
> Also please share the network connection bandwidth between these two
> servers.
>
> Thanks
>
> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
>
>> The fsimage file size is 1658934155
>>
>>
>> 2013/8/13 Harsh J <ha...@cloudera.com>
>>
>>> How large are your checkpointed fsimage files?
>>>
>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>>> > When Standby Namenode is doing checkpoint, upload the image file to
>>> Active
>>> > NameNode, the Active NameNode is very slow. What is reason result to
>>> the
>>> > Active NameNode is slow?
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > LiuLei
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
Can you please let me know that how you identified the slowness between
primary and standby namnode?
Also please share the network connection bandwidth between these two
servers.
Thanks
On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
> The fsimage file size is 1658934155
>
>
> 2013/8/13 Harsh J <ha...@cloudera.com>
>
>> How large are your checkpointed fsimage files?
>>
>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>> > When Standby Namenode is doing checkpoint, upload the image file to
>> Active
>> > NameNode, the Active NameNode is very slow. What is reason result to the
>> > Active NameNode is slow?
>> >
>> >
>> > Thanks,
>> >
>> > LiuLei
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
Can you please let me know that how you identified the slowness between
primary and standby namnode?
Also please share the network connection bandwidth between these two
servers.
Thanks
On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
> The fsimage file size is 1658934155
>
>
> 2013/8/13 Harsh J <ha...@cloudera.com>
>
>> How large are your checkpointed fsimage files?
>>
>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>> > When Standby Namenode is doing checkpoint, upload the image file to
>> Active
>> > NameNode, the Active NameNode is very slow. What is reason result to the
>> > Active NameNode is slow?
>> >
>> >
>> > Thanks,
>> >
>> > LiuLei
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
Can you please let me know that how you identified the slowness between
primary and standby namnode?
Also please share the network connection bandwidth between these two
servers.
Thanks
On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
> The fsimage file size is 1658934155
>
>
> 2013/8/13 Harsh J <ha...@cloudera.com>
>
>> How large are your checkpointed fsimage files?
>>
>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>> > When Standby Namenode is doing checkpoint, upload the image file to
>> Active
>> > NameNode, the Active NameNode is very slow. What is reason result to the
>> > Active NameNode is slow?
>> >
>> >
>> > Thanks,
>> >
>> > LiuLei
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Jitendra Yadav <je...@gmail.com>.
Hi,
Can you please let me know that how you identified the slowness between
primary and standby namnode?
Also please share the network connection bandwidth between these two
servers.
Thanks
On Tue, Aug 13, 2013 at 11:52 AM, lei liu <li...@gmail.com> wrote:
> The fsimage file size is 1658934155
>
>
> 2013/8/13 Harsh J <ha...@cloudera.com>
>
>> How large are your checkpointed fsimage files?
>>
>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
>> > When Standby Namenode is doing checkpoint, upload the image file to
>> Active
>> > NameNode, the Active NameNode is very slow. What is reason result to the
>> > Active NameNode is slow?
>> >
>> >
>> > Thanks,
>> >
>> > LiuLei
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
The fsimage file size is 1658934155
2013/8/13 Harsh J <ha...@cloudera.com>
> How large are your checkpointed fsimage files?
>
> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> > When Standby Namenode is doing checkpoint, upload the image file to
> Active
> > NameNode, the Active NameNode is very slow. What is reason result to the
> > Active NameNode is slow?
> >
> >
> > Thanks,
> >
> > LiuLei
> >
>
>
>
> --
> Harsh J
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
The fsimage file size is 1658934155
2013/8/13 Harsh J <ha...@cloudera.com>
> How large are your checkpointed fsimage files?
>
> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> > When Standby Namenode is doing checkpoint, upload the image file to
> Active
> > NameNode, the Active NameNode is very slow. What is reason result to the
> > Active NameNode is slow?
> >
> >
> > Thanks,
> >
> > LiuLei
> >
>
>
>
> --
> Harsh J
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
The fsimage file size is 1658934155
2013/8/13 Harsh J <ha...@cloudera.com>
> How large are your checkpointed fsimage files?
>
> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> > When Standby Namenode is doing checkpoint, upload the image file to
> Active
> > NameNode, the Active NameNode is very slow. What is reason result to the
> > Active NameNode is slow?
> >
> >
> > Thanks,
> >
> > LiuLei
> >
>
>
>
> --
> Harsh J
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by lei liu <li...@gmail.com>.
The fsimage file size is 1658934155
2013/8/13 Harsh J <ha...@cloudera.com>
> How large are your checkpointed fsimage files?
>
> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> > When Standby Namenode is doing checkpoint, upload the image file to
> Active
> > NameNode, the Active NameNode is very slow. What is reason result to the
> > Active NameNode is slow?
> >
> >
> > Thanks,
> >
> > LiuLei
> >
>
>
>
> --
> Harsh J
>
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Harsh J <ha...@cloudera.com>.
How large are your checkpointed fsimage files?
On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> When Standby Namenode is doing checkpoint, upload the image file to Active
> NameNode, the Active NameNode is very slow. What is reason result to the
> Active NameNode is slow?
>
>
> Thanks,
>
> LiuLei
>
--
Harsh J
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Harsh J <ha...@cloudera.com>.
How large are your checkpointed fsimage files?
On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> When Standby Namenode is doing checkpoint, upload the image file to Active
> NameNode, the Active NameNode is very slow. What is reason result to the
> Active NameNode is slow?
>
>
> Thanks,
>
> LiuLei
>
--
Harsh J
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Harsh J <ha...@cloudera.com>.
How large are your checkpointed fsimage files?
On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> When Standby Namenode is doing checkpoint, upload the image file to Active
> NameNode, the Active NameNode is very slow. What is reason result to the
> Active NameNode is slow?
>
>
> Thanks,
>
> LiuLei
>
--
Harsh J
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Posted by Harsh J <ha...@cloudera.com>.
How large are your checkpointed fsimage files?
On Mon, Aug 12, 2013 at 3:42 PM, lei liu <li...@gmail.com> wrote:
> When Standby Namenode is doing checkpoint, upload the image file to Active
> NameNode, the Active NameNode is very slow. What is reason result to the
> Active NameNode is slow?
>
>
> Thanks,
>
> LiuLei
>
--
Harsh J