You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "genericqa (JIRA)" <ji...@apache.org> on 2018/05/08 00:45:00 UTC

[jira] [Commented] (HADOOP-15449) ZK performance issues causing frequent Namenode failover

    [ https://issues.apache.org/jira/browse/HADOOP-15449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466685#comment-16466685 ] 

genericqa commented on HADOOP-15449:
------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 67m 47s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 14s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}119m 31s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15449 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922223/HADOOP-15449.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  |
| uname | Linux e2eea1bc6f6f 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 696a4be |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
|  Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/14597/testReport/ |
| Max. process+thread count | 1513 (vs. ulimit of 10000) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/14597/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> ZK performance issues causing frequent Namenode failover 
> ---------------------------------------------------------
>
>                 Key: HADOOP-15449
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15449
>             Project: Hadoop Common
>          Issue Type: Wish
>          Components: common
>    Affects Versions: 2.7.4
>            Reporter: Karthik Palanisamy
>            Assignee: Karthik Palanisamy
>            Priority: Critical
>         Attachments: HADOOP-15449.patch
>
>
> We observed from several users regarding Namenode flip-over is due to either zookeeper disk slowness (higher fsync cost) or network issue. We would need to avoid flip-over issue to some extent by increasing HA session timeout, ha.zookeeper.session-timeout.ms.
> Default value is 5000 ms, seems very low in any production environment.  I would suggest 10000 ms as default session timeout.
>  
> {code}
> ..
> 2018-05-04 03:54:36,848 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 4689ms for sessionid 0x260e24bac500aa3, closing socket connection and attempting reconnect 
> 2018-05-04 03:56:49,088 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3981ms for sessionid 0x360fd152b8700fe, closing socket connection and attempting reconnect
> .. 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org