Posted to dev@hama.apache.org by "Thomas Jungblut (Created) (JIRA)" <ji...@apache.org> on 2011/10/17 07:30:11 UTC

[jira] [Created] (HAMA-454) Add Zookeeper as synchronization service

Add Zookeeper as synchronization service
----------------------------------------

                 Key: HAMA-454
                 URL: https://issues.apache.org/jira/browse/HAMA-454
             Project: Hama
          Issue Type: Sub-task
            Reporter: Thomas Jungblut




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128743#comment-13128743 ] 

Thomas Jungblut commented on HAMA-454:
--------------------------------------

Boom.
I am facing lock issues with two tasks:

Task 1
{noformat}
11/10/17 11:15:00 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:21810, sessionid = 0x13311060315000d, negotiated timeout = 1200000
11/10/17 11:15:00 INFO bsp.YarnSerializePrinting$HelloBSP: [Ljava.lang.String;@42787d6a
11/10/17 11:15:00 INFO bsp.YarnSerializePrinting$HelloBSP: Hello BSP from 1 of 2: localhost.localdomain:16002
11/10/17 11:15:01 INFO bsp.BSPPeerImpl: xxxx 1. At superstep: 0 which task is waiting? attempt_appattempt_1318835555330_0018_000001_0000_000001_1 stat is null? null
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() !!! checking znodes contnains /ready node or not: at superstep:0 znode:[attempt_appattempt_1318835555330_0018_000001_0000_000001_1, attempt_appattempt_1318835555330_0018_000001_0000_000000_0, ready]
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() at superstep:0 znode size: (2) znodes:[attempt_appattempt_1318835555330_0018_000001_0000_000001_1, attempt_appattempt_1318835555330_0018_000001_0000_000000_0]
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() at superstep: 0 taskid:attempt_appattempt_1318835555330_0018_000001_0000_000001_1 lowest: attempt_appattempt_1318835555330_0018_000001_0000_000000_0 highest:attempt_appattempt_1318835555330_0018_000001_0000_000001_1
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() znode at superstep:0 taskid:attempt_appattempt_1318835555330_0018_000001_0000_000001_1 exists, so delete it.
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() !!! checking znodes contnains /ready node or not: at superstep:0 znode:[ready]
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() at superstep:0 znode size: (0) znodes:[]
{noformat}

Task 2
{noformat}
11/10/17 11:15:00 INFO bsp.YarnSerializePrinting$HelloBSP: [Ljava.lang.String;@df4cbee
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() !!! checking znodes contnains /ready node or not: at superstep:0 znode:[attempt_appattempt_1318835555330_0018_000001_0000_000001_1, attempt_appattempt_1318835555330_0018_000001_0000_000000_0, ready]
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() at superstep:0 znode size: (2) znodes:[attempt_appattempt_1318835555330_0018_000001_0000_000001_1, attempt_appattempt_1318835555330_0018_000001_0000_000000_0]
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() at superstep: 0 taskid:attempt_appattempt_1318835555330_0018_000001_0000_000000_0 lowest: attempt_appattempt_1318835555330_0018_000001_0000_000000_0 highest:attempt_appattempt_1318835555330_0018_000001_0000_000001_1
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() !!! checking znodes contnains /ready node or not: at superstep:0 znode:[attempt_appattempt_1318835555330_0018_000001_0000_000000_0, ready]
11/10/17 11:15:02 INFO bsp.BSPPeerImpl: leaveBarrier() at superstep:0 znode size: (1) znodes:[attempt_appattempt_1318835555330_0018_000001_0000_000000_0]
{noformat}

And it hangs forever.

I use the app attempt ID as the znode, and for each task I create a znode with the host:port pair.
Do you know what I did wrong?
I have attached the patch.
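
For readers following the leaveBarrier() traces above, here is a minimal in-memory sketch of the general double-barrier idea: every task registers and waits for all others (enter), then deregisters and waits until nobody is left (leave). It is not Hama's BSPPeerImpl and it does not use the ZooKeeper API; the class name DoubleBarrierSketch, the "task-N" ids, and the 10 second timeout are all hypothetical. A plain Set stands in for the per-task znodes and a boolean stands in for the "ready" marker, and the barrier is deliberately single-use (one superstep).

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory stand-in for a single-use double barrier (assumption: not ZooKeeper-backed). */
public class DoubleBarrierSketch {
  private final int expected;                            // number of tasks that must arrive
  private final Set<String> members = new HashSet<>();   // stands in for the per-task znodes
  private boolean ready = false;                         // stands in for the "ready" marker

  public DoubleBarrierSketch(int expected) {
    this.expected = expected;
  }

  /** Register this task, then block until all expected tasks have registered. */
  public synchronized void enter(String taskId) throws InterruptedException {
    members.add(taskId);
    if (members.size() == expected) {
      ready = true;   // the last arrival publishes the marker
      notifyAll();
    }
    while (!ready) {
      wait();
    }
  }

  /** Deregister this task, then block until every task has deregistered. */
  public synchronized void leave(String taskId) throws InterruptedException {
    members.remove(taskId);
    notifyAll();
    while (!members.isEmpty()) {
      wait();
    }
  }

  /** Runs n tasks through one superstep and returns how many passed both barriers. */
  public static int runTasks(int n) {
    DoubleBarrierSketch barrier = new DoubleBarrierSketch(n);
    AtomicInteger finished = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(n);
    for (int i = 0; i < n; i++) {
      String taskId = "task-" + i;   // hypothetical task id
      pool.execute(() -> {
        try {
          barrier.enter(taskId);
          barrier.leave(taskId);
          finished.incrementAndGet();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        pool.shutdownNow();   // a hang would show up here as a timeout
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return finished.get();
  }

  public static void main(String[] args) {
    System.out.println("tasks through both barriers: " + runTasks(2));
  }
}
```

The separate ready flag matters: if enter() only compared the member count with the expected count, a fast task could already have removed itself in leave() before a slow task re-checked the count, and the slow task would wait forever.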
                
> Add Zookeeper as synchronization service
> ----------------------------------------
>
>                 Key: HAMA-454
>                 URL: https://issues.apache.org/jira/browse/HAMA-454
>             Project: Hama
>          Issue Type: Sub-task
>            Reporter: Thomas Jungblut
>
> We should use Zookeeper instead of our own implementation.
> Additionally we should use the plain BSPPeerImpl in YARN to reduce duplicate code.


[jira] [Updated] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Jungblut updated HAMA-454:
---------------------------------

    Attachment: HAMA-454_v3.patch

Now with try/catch and debug-level log output.

{noformat}
2011-10-23 16:10:49,699 INFO  bsp.YARNBSPJob (YARNBSPJob.java:waitForCompletion(275)) - Job succeeded!
{noformat}
                
> Add Zookeeper as synchronization service
> ----------------------------------------
>
>                 Key: HAMA-454
>                 URL: https://issues.apache.org/jira/browse/HAMA-454
>             Project: Hama
>          Issue Type: Sub-task
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
>         Attachments: HAMA-454.patch, HAMA-454.patch, HAMA-454_v2.patch, HAMA-454_v3.patch
>
>
> We should use Zookeeper instead of our own implementation.
> Additionally we should use the plain BSPPeerImpl in YARN to reduce duplicate code.


[jira] [Commented] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Edward J. Yoon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134669#comment-13134669 ] 

Edward J. Yoon commented on HAMA-454:
-------------------------------------

Here are my quick test results:

{code}

root@Cnode1:/usr/local/src/hama-trunk# core/bin/hama jar examples/target/hama-examples-0.4.0-incubating-SNAPSHOT.jar pi
11/10/25 09:51:06 INFO bsp.BSPJobClient: Running job: job_201110250950_0001
11/10/25 09:51:09 INFO bsp.BSPJobClient: Current supersteps number: 0
11/10/25 09:51:15 INFO bsp.BSPJobClient: Current supersteps number: 1
11/10/25 09:51:18 INFO bsp.BSPJobClient: The total number of supersteps: 1
Estimated value of PI is 3.1417666666666673
Job Finished in 12.914 seconds
root@Cnode1:/usr/local/src/hama-trunk# core/bin/stop-bspd.sh
stopping bspmaster
hnode3: stopping groom
hnode2: stopping groom
hnode1: stopping groom
hnode4: stopping groom
hnode1: stopping zookeeper
root@Cnode1:/usr/local/src/hama-trunk# svn stat
?       HAMA-454_v3.patch
M       core/conf/hama-env.sh
M       core/conf/hama-site.xml
M       core/conf/groomservers
root@Cnode1:/usr/local/src/hama-trunk# patch -p0 < HAMA-454_v3.patch
patching file core/src/main/java/org/apache/hama/zookeeper/QuorumPeer.java
patching file core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java
patching file core/src/main/java/org/apache/hama/bsp/sync/SyncServer.java
patching file core/src/main/java/org/apache/hama/bsp/sync/zookeeper/ZooKeeperSyncClientImpl.java
patching file core/src/main/java/org/apache/hama/bsp/sync/zookeeper/ZooKeeperSyncServerImpl.java
patching file core/src/main/java/org/apache/hama/bsp/sync/rpc/RPCSyncClientImpl.java
patching file core/src/main/java/org/apache/hama/bsp/sync/rpc/RPCSyncServerImpl.java
patching file core/src/main/java/org/apache/hama/bsp/sync/SyncServerRunner.java
patching file yarn/src/main/java/org/apache/hama/bsp/BSPClient.java
patching file yarn/src/main/java/org/apache/hama/bsp/BSPApplicationMaster.java
patching file yarn/src/main/java/org/apache/hama/bsp/YARNBSPJob.java
patching file yarn/src/main/java/org/apache/hama/bsp/sync/SyncServer.java
patching file yarn/src/main/java/org/apache/hama/bsp/sync/SyncServerImpl.java
patching file yarn/src/main/java/org/apache/hama/bsp/YARNBSPPeerImpl.java
patching file yarn/src/main/java/org/apache/hama/bsp/JobImpl.java
patching file yarn/src/main/java/org/apache/hama/bsp/YarnSerializePrinting.java
patching file yarn/src/main/java/org/apache/hama/bsp/BSPRunner.java

root@Cnode1:/usr/local/src/hama-trunk# mvn clean install package -Dmaven.test.skip=true
....

root@Cnode1:/usr/local/src/hama-trunk# core/bin/start-bspd.sh
hnode1: starting zookeeper, logging to /usr/local/src/hama-trunk/core/bin/../logs/hama-root-zookeeper-Cnode1.out
starting bspmaster, logging to /usr/local/src/hama-trunk/core/bin/../logs/hama-root-bspmaster-Cnode1.out
2011-10-25 09:54:04.190::INFO:  Logging to STDERR via org.mortbay.log.StdErrLog
2011-10-25 09:54:04.232::INFO:  jetty-6.1.14
hnode3: starting groom, logging to /usr/local/src/hama-trunk/core/bin/../logs/hama-root-groom-Cnode3.out
hnode1: starting groom, logging to /usr/local/src/hama-trunk/core/bin/../logs/hama-root-groom-Cnode1.out
hnode4: starting groom, logging to /usr/local/src/hama-trunk/core/bin/../logs/hama-root-groom-Cnode4.out
hnode2: starting groom, logging to /usr/local/src/hama-trunk/core/bin/../logs/hama-root-groom-Cnode2.out

root@Cnode1:/usr/local/src/hama-trunk# core/bin/hama jar examples/target/hama-examples-0.4.0-incubating-SNAPSHOT.jar pi
11/10/25 09:54:29 INFO bsp.BSPJobClient: Running job: job_201110250954_0001
11/10/25 09:54:32 INFO bsp.BSPJobClient: Current supersteps number: 0
............................................................

GroomServer LOG:

2011-10-25 09:55:48,130 INFO org.apache.hama.bsp.GroomServer: Starting groom: cnode1.ucloud:50000
2011-10-25 09:56:15,283 WARN org.apache.hama.bsp.GroomServer: Error initializing attempt_201110250955_0001_000002_0:
java.lang.NullPointerException
        at org.apache.hama.bsp.GroomServer.localizeJob(GroomServer.java:548)
        at org.apache.hama.bsp.GroomServer.startNewTask(GroomServer.java:488)
        at org.apache.hama.bsp.GroomServer.access$000(GroomServer.java:76)
        at org.apache.hama.bsp.GroomServer$DispatchTasksHandler.handle(GroomServer.java:165)
        at org.apache.hama.bsp.GroomServer$Instructor.run(GroomServer.java:210)

2011-10-25 09:56:15,315 WARN org.apache.hama.bsp.GroomServer: Error initializing attempt_201110250955_0001_000008_0:
java.lang.NullPointerException
        at org.apache.hama.bsp.GroomServer.localizeJob(GroomServer.java:548)
        at org.apache.hama.bsp.GroomServer.startNewTask(GroomServer.java:488)
        at org.apache.hama.bsp.GroomServer.access$000(GroomServer.java:76)
        at org.apache.hama.bsp.GroomServer$DispatchTasksHandler.handle(GroomServer.java:165)
        at org.apache.hama.bsp.GroomServer$Instructor.run(GroomServer.java:210)

2011-10-25 09:56:18,151 INFO org.apache.hama.bsp.TaskRunner: Start building BSPPeer process.

{code}
                

[jira] [Updated] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Jungblut updated HAMA-454:
---------------------------------

    Attachment: HAMA-454_v2.patch

Small adjustments, but it still hangs.
                
> Add Zookeeper as synchronization service
> ----------------------------------------
>
>                 Key: HAMA-454
>                 URL: https://issues.apache.org/jira/browse/HAMA-454
>             Project: Hama
>          Issue Type: Sub-task
>            Reporter: Thomas Jungblut
>         Attachments: HAMA-454.patch, HAMA-454_v2.patch
>
>
> We should use Zookeeper instead of our own implementation.
> Additionally we should use the plain BSPPeerImpl in YARN to reduce duplicate code.


[jira] [Updated] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Jungblut updated HAMA-454:
---------------------------------

    Description: 
We should use Zookeeper instead of our own implementation.
Additionally we should use the plain BSPPeerImpl in YARN to reduce duplicate code.
    

[jira] [Updated] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Jungblut updated HAMA-454:
---------------------------------

    Attachment: HAMA-454.patch
    
> Add Zookeeper as synchronization service
> ----------------------------------------
>
>                 Key: HAMA-454
>                 URL: https://issues.apache.org/jira/browse/HAMA-454
>             Project: Hama
>          Issue Type: Sub-task
>            Reporter: Thomas Jungblut
>         Attachments: HAMA-454.patch
>
>
> We should use Zookeeper instead of our own implementation.
> Additionally we should use the plain BSPPeerImpl in YARN to reduce duplicate code.


[jira] [Commented] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144437#comment-13144437 ] 

Hudson commented on HAMA-454:
-----------------------------

Integrated in Hama-Nightly #343 (See [https://builds.apache.org/job/Hama-Nightly/343/])
    [HAMA-454] Add Zookeeper as synchronization service

tjungblut : 
Files : 
* /incubator/hama/trunk/CHANGES.txt
* /incubator/hama/trunk/core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java
* /incubator/hama/trunk/core/src/main/java/org/apache/hama/bsp/sync/SyncServer.java
* /incubator/hama/trunk/core/src/main/java/org/apache/hama/bsp/sync/SyncServerRunner.java
* /incubator/hama/trunk/core/src/main/java/org/apache/hama/bsp/sync/ZooKeeperSyncClientImpl.java
* /incubator/hama/trunk/core/src/main/java/org/apache/hama/bsp/sync/ZooKeeperSyncServerImpl.java
* /incubator/hama/trunk/core/src/main/java/org/apache/hama/zookeeper/QuorumPeer.java
* /incubator/hama/trunk/pom.xml
* /incubator/hama/trunk/yarn/src/main/java/org/apache/hama/bsp/BSPApplicationMaster.java
* /incubator/hama/trunk/yarn/src/main/java/org/apache/hama/bsp/BSPClient.java
* /incubator/hama/trunk/yarn/src/main/java/org/apache/hama/bsp/BSPRunner.java
* /incubator/hama/trunk/yarn/src/main/java/org/apache/hama/bsp/JobImpl.java
* /incubator/hama/trunk/yarn/src/main/java/org/apache/hama/bsp/YARNBSPJob.java
* /incubator/hama/trunk/yarn/src/main/java/org/apache/hama/bsp/YARNBSPPeerImpl.java
* /incubator/hama/trunk/yarn/src/main/java/org/apache/hama/bsp/YarnSerializePrinting.java
* /incubator/hama/trunk/yarn/src/main/java/org/apache/hama/bsp/sync/SyncServer.java
* /incubator/hama/trunk/yarn/src/main/java/org/apache/hama/bsp/sync/SyncServerImpl.java

                
> Add Zookeeper as synchronization service
> ----------------------------------------
>
>                 Key: HAMA-454
>                 URL: https://issues.apache.org/jira/browse/HAMA-454
>             Project: Hama
>          Issue Type: Sub-task
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
>             Fix For: 0.4.0
>
>         Attachments: HAMA-454.patch, HAMA-454.patch, HAMA-454_v2.patch, HAMA-454_v3.patch
>
>
> We should use Zookeeper instead of our own implementation.
> Additionally we should use the plain BSPPeerImpl in YARN to reduce duplicate code.


[jira] [Assigned] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Jungblut reassigned HAMA-454:
------------------------------------

    Assignee: Thomas Jungblut
    
> Add Zookeeper as synchronization service
> ----------------------------------------
>
>                 Key: HAMA-454
>                 URL: https://issues.apache.org/jira/browse/HAMA-454
>             Project: Hama
>          Issue Type: Sub-task
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
>         Attachments: HAMA-454.patch, HAMA-454_v2.patch
>
>
> We should use Zookeeper instead of our own implementation.
> Additionally we should use the plain BSPPeerImpl in YARN to reduce duplicate code.


[jira] [Commented] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Edward J. Yoon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134671#comment-13134671 ] 

Edward J. Yoon commented on HAMA-454:
-------------------------------------

It looks like a problem with HDFS. Anyway, the patch looks good. +1!
                

[jira] [Commented] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129689#comment-13129689 ] 

Thomas Jungblut commented on HAMA-454:
--------------------------------------

Let's hold off on this task until we have refactored the BSPPeer properly and extracted the sync interfaces.
                

[jira] [Commented] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144154#comment-13144154 ] 

Thomas Jungblut commented on HAMA-454:
--------------------------------------

After a lot of merging, I am going to commit this now.

The I/O system is currently not supported by YARN.
                

[jira] [Resolved] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Jungblut resolved HAMA-454.
----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.4.0
    

[jira] [Updated] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Jungblut updated HAMA-454:
---------------------------------

    Attachment: HAMA-454.patch

Zookeeper now works (theoretically).

Sometimes I still have shutdown issues in the ApplicationMaster:

{noformat}

11/10/21 18:30:57 INFO server.NIOServerCnxn: NIOServerCnxn factory exited run method
11/10/21 18:30:57 INFO server.PrepRequestProcessor: PrepRequestProcessor exited loop!
11/10/21 18:30:57 INFO server.SyncRequestProcessor: SyncRequestProcessor exited!
11/10/21 18:30:57 INFO server.FinalRequestProcessor: shutdown of request processor complete
11/10/21 18:30:57 INFO server.FinalRequestProcessor: shutdown of request processor complete
11/10/21 18:30:57 WARN jmx.MBeanRegistry: Failed to unregister MBean InMemoryDataTree
11/10/21 18:30:57 WARN jmx.MBeanRegistry: Error during unregister
javax.management.InstanceNotFoundException: org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:415)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:403)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:506)
        at org.apache.zookeeper.jmx.MBeanRegistry.unregister(MBeanRegistry.java:94)
        at org.apache.zookeeper.jmx.MBeanRegistry.unregister(MBeanRegistry.java:111)
        at org.apache.zookeeper.server.ZooKeeperServer.unregisterJMX(ZooKeeperServer.java:421)
        at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:414)
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.shutdown(NIOServerCnxn.java:323)
        at org.apache.zookeeper.server.ZooKeeperServerMain.shutdown(ZooKeeperServerMain.java:125)
        at org.apache.hama.zookeeper.QuorumPeer$ShutdownableZooKeeperServerMain.shutdown(QuorumPeer.java:407)
        at org.apache.hama.zookeeper.QuorumPeer$ShutdownableZooKeeperServerMain.shutdownZookeeperMain(QuorumPeer.java:402)
        at org.apache.hama.bsp.sync.zookeeper.ZooKeeperSyncServerImpl.stopServer(ZooKeeperSyncServerImpl.java:82)
        at org.apache.hama.bsp.sync.SyncServerRunner.stop(SyncServerRunner.java:47)
        at org.apache.hama.bsp.BSPApplicationMaster.cleanup(BSPApplicationMaster.java:243)
        at org.apache.hama.bsp.BSPApplicationMaster.main(BSPApplicationMaster.java:279)
{noformat}

Apart from this, everything works.
Even the RPC synchronization works, but we have to manually build the core module with Hadoop 23.0.
                
> Add Zookeeper as synchronization service
> ----------------------------------------
>
>                 Key: HAMA-454
>                 URL: https://issues.apache.org/jira/browse/HAMA-454
>             Project: Hama
>          Issue Type: Sub-task
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
>         Attachments: HAMA-454.patch, HAMA-454.patch, HAMA-454_v2.patch
>
>
> We should use Zookeeper instead of our own implementation.
> Additionally we should use the plain BSPPeerImpl in YARN to reduce duplicate code.


[jira] [Commented] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132811#comment-13132811 ] 

Thomas Jungblut commented on HAMA-454:
--------------------------------------

We could always catch and swallow the exception, but that would not be a good way to solve this.
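
One middle ground between swallowing everything and letting shutdown fail is to tolerate only the specific exception seen in the trace (javax.management.InstanceNotFoundException) and log it at a debug-like level, while still surfacing anything unexpected. The sketch below uses only the JDK's JMX and java.util.logging classes; the class name QuietUnregister and the helper unregisterQuietly are hypothetical and are not taken from the Hama patch or from ZooKeeper's MBeanRegistry.

```java
import java.lang.management.ManagementFactory;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.management.InstanceNotFoundException;
import javax.management.MBeanRegistrationException;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Hypothetical helper: tolerate "bean not found" on shutdown, surface everything else. */
public class QuietUnregister {
  private static final Logger LOG = Logger.getLogger(QuietUnregister.class.getName());

  /**
   * Tries to unregister the named MBean. Returns true if it was removed and
   * false if it was not registered (logged at FINE, roughly "debug" level).
   */
  public static boolean unregisterQuietly(MBeanServer server, String name) {
    try {
      server.unregisterMBean(new ObjectName(name));
      return true;
    } catch (InstanceNotFoundException e) {
      // Tolerated here: the bean is already absent, so there is nothing left to clean up.
      LOG.log(Level.FINE, "MBean not registered: " + name, e);
      return false;
    } catch (MalformedObjectNameException | MBeanRegistrationException e) {
      // Anything else is unexpected, so report it instead of hiding it.
      throw new IllegalStateException("Could not unregister " + name, e);
    }
  }

  public static void main(String[] args) {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    // Object name copied from the stack trace in this thread.
    String name =
        "org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree";
    System.out.println("unregistered: " + unregisterQuietly(server, name));
  }
}
```

Returning a boolean keeps the "already gone" case visible to the caller without turning it into a WARN-level stack trace.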
                

[jira] [Issue Comment Edited] (HAMA-454) Add Zookeeper as synchronization service

Posted by "Thomas Jungblut (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HAMA-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132803#comment-13132803 ] 

Thomas Jungblut edited comment on HAMA-454 at 10/21/11 4:34 PM:
----------------------------------------------------------------

Zookeeper now works (theoretically).

Sometimes I still have shutdown issues in the  ApplicationMaster:

{noformat}

11/10/21 18:30:57 INFO server.NIOServerCnxn: NIOServerCnxn factory exited run method
11/10/21 18:30:57 INFO server.PrepRequestProcessor: PrepRequestProcessor exited loop!
11/10/21 18:30:57 INFO server.SyncRequestProcessor: SyncRequestProcessor exited!
11/10/21 18:30:57 INFO server.FinalRequestProcessor: shutdown of request processor complete
11/10/21 18:30:57 INFO server.FinalRequestProcessor: shutdown of request processor complete
11/10/21 18:30:57 WARN jmx.MBeanRegistry: Failed to unregister MBean InMemoryDataTree
11/10/21 18:30:57 WARN jmx.MBeanRegistry: Error during unregister
javax.management.InstanceNotFoundException: org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:415)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:403)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:506)
        at org.apache.zookeeper.jmx.MBeanRegistry.unregister(MBeanRegistry.java:94)
        at org.apache.zookeeper.jmx.MBeanRegistry.unregister(MBeanRegistry.java:111)
        at org.apache.zookeeper.server.ZooKeeperServer.unregisterJMX(ZooKeeperServer.java:421)
        at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:414)
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.shutdown(NIOServerCnxn.java:323)
        at org.apache.zookeeper.server.ZooKeeperServerMain.shutdown(ZooKeeperServerMain.java:125)
        at org.apache.hama.zookeeper.QuorumPeer$ShutdownableZooKeeperServerMain.shutdown(QuorumPeer.java:407)
        at org.apache.hama.zookeeper.QuorumPeer$ShutdownableZooKeeperServerMain.shutdownZookeeperMain(QuorumPeer.java:402)
        at org.apache.hama.bsp.sync.zookeeper.ZooKeeperSyncServerImpl.stopServer(ZooKeeperSyncServerImpl.java:82)
        at org.apache.hama.bsp.sync.SyncServerRunner.stop(SyncServerRunner.java:47)
        at org.apache.hama.bsp.BSPApplicationMaster.cleanup(BSPApplicationMaster.java:243)
        at org.apache.hama.bsp.BSPApplicationMaster.main(BSPApplicationMaster.java:279)
{noformat}

Besides this, everything works.
Even the RPC synchronization works, but we have to manually build the core module against Hadoop 0.23.0.

The unit tests pass as well:

{noformat}
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hama parent POM ............................ SUCCESS [3.121s]
[INFO] Apache Hama Core .................................. SUCCESS [11.107s]
[INFO] Apache Hama Graph Package ......................... SUCCESS [2.572s]
[INFO] Apache Hama Examples .............................. SUCCESS [25.469s]
[INFO] Apache Hama YARN .................................. SUCCESS [46.075s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:30.192s
[INFO] Finished at: Fri Oct 21 18:29:10 CEST 2011
[INFO] Final Memory: 35M/353M
[INFO] ------------------------------------------------------------------------

{noformat}
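
For reference, the WARN in the shutdown trace above is what plain JMX produces when unregisterMBean is called for a name that is not (or is no longer) registered. Whether that is really what happens in the ApplicationMaster is an assumption; the thread does not confirm it. The sketch below reproduces the behaviour without any ZooKeeper dependency. DataTreeInfo, FixedDataTreeInfo, unregisterQuietly and the org.example.sketch domain are hypothetical names made up for illustration, not ZooKeeper or Hama code.

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceNotFoundException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class JmxUnregisterSketch {

    /** Hypothetical management interface standing in for a real server bean. */
    public interface DataTreeInfo {
        int getNodeCount();
    }

    /** Trivial implementation so that there is something to register. */
    public static class FixedDataTreeInfo implements DataTreeInfo {
        @Override
        public int getNodeCount() {
            return 1;
        }
    }

    /**
     * Unregisters the bean only if it is currently registered.
     * Returns true if a bean was removed, false if there was nothing to remove.
     */
    public static boolean unregisterQuietly(MBeanServer server, ObjectName name)
            throws Exception {
        if (!server.isRegistered(name)) {
            return false;
        }
        try {
            server.unregisterMBean(name);
            return true;
        } catch (InstanceNotFoundException e) {
            // The bean disappeared between the check and the call.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Assumed object name, loosely modelled on the one in the log.
        ObjectName name = new ObjectName(
            "org.example.sketch:name0=StandaloneServer,name1=InMemoryDataTree");

        server.registerMBean(
            new StandardMBean(new FixedDataTreeInfo(), DataTreeInfo.class), name);

        // The first unregister succeeds.
        server.unregisterMBean(name);

        // An unguarded second unregister throws InstanceNotFoundException.
        try {
            server.unregisterMBean(name);
        } catch (InstanceNotFoundException e) {
            System.out.println("second unregister failed: " + e.getMessage());
        }

        // The guarded variant reports that nothing was left to remove.
        System.out.println("guarded unregister removed a bean: "
            + unregisterQuietly(server, name));
    }
}
```

If the cause is indeed a double shutdown, a guard of this kind (or making sure the shutdown path runs only once) would silence the warning. The stack trace is logged at WARN level, so it looks like noise on shutdown rather than a functional failure.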
                
