Posted to dev@hama.apache.org by "Edward J. Yoon (JIRA)" <ji...@apache.org> on 2011/08/23 08:33:31 UTC
[jira] [Commented] (HAMA-413) Remove limitation on the number of tasks
[ https://issues.apache.org/jira/browse/HAMA-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089294#comment-13089294 ]
Edward J. Yoon commented on HAMA-413:
-------------------------------------
Below are the results on 16 physical nodes.
{code}
JobClient LOG:
11/08/23 15:27:57 DEBUG bsp.BSPJobClient: BSPJobClient.submitJobDir: hdfs://hnode15:9000/tmp/hadoop-root/bsp/system/submit_22he6c
11/08/23 15:27:58 INFO bsp.BSPJobClient: Running job: job_201108231527_0001
11/08/23 15:28:01 INFO bsp.BSPJobClient: Current supersteps number: 0
11/08/23 15:28:22 INFO bsp.BSPJobClient: The total number of supersteps: 0
java.io.FileNotFoundException: File does not exist: /tmp/pi-example/output
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:457)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:676)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
at org.apache.hama.examples.PiEstimator.printOutput(PiEstimator.java:109)
at org.apache.hama.examples.PiEstimator.main(PiEstimator.java:151)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hama.examples.ExampleDriver.main(ExampleDriver.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hama.util.RunJar.main(RunJar.java:145)
----
LOG of node16 groomserver:
2011-08-23 15:28:02,743 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 11/08/23 15:28:02 WARN bsp.GroomServer: Error running child
2011-08-23 15:28:02,744 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 java.lang.NullPointerException
2011-08-23 15:28:02,744 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 at org.apache.hama.bsp.BSPPeer.send(BSPPeer.java:167)
2011-08-23 15:28:02,744 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 at org.apache.hama.examples.PiEstimator$MyEstimator.bsp(PiEstimator.java:64)
2011-08-23 15:28:02,744 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 at org.apache.hama.bsp.BSPTask.run(BSPTask.java:60)
2011-08-23 15:28:02,744 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 at org.apache.hama.bsp.GroomServer$Child.main(GroomServer.java:875)
2011-08-23 15:28:02,744 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 11/08/23 15:28:02 INFO ipc.Server: Stopping server on 61001
2011-08-23 15:28:02,744 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 11/08/23 15:28:02 INFO ipc.Server: IPC Server handler 0 on 61001: exiting
2011-08-23 15:28:02,744 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 11/08/23 15:28:02 INFO ipc.Server: Stopping IPC Server listener on 61001
2011-08-23 15:28:02,744 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000025_0 11/08/23 15:28:02 INFO ipc.Server: Stopping IPC Server Responder
2011-08-23 15:28:02,764 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000039_0 11/08/23 15:28:02 INFO ipc.Server: IPC Server Responder: starting
2011-08-23 15:28:02,764 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000039_0 11/08/23 15:28:02 INFO ipc.Server: IPC Server listener on 61002: starting
...
2011-08-23 15:28:02,951 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000039_0 11/08/23 15:28:02 INFO ipc.Server: Stopping server on 61002
2011-08-23 15:28:02,951 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000039_0 11/08/23 15:28:02 INFO ipc.Server: IPC Server handler 0 on 61002: exiting
2011-08-23 15:28:02,951 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000039_0 11/08/23 15:28:02 INFO ipc.Server: Stopping IPC Server listener on 61002
2011-08-23 15:28:02,951 INFO org.apache.hama.bsp.TaskRunner: attempt_201108231527_0001_000039_0 11/08/23 15:28:02 INFO ipc.Server: Stopping IPC Server Responder
2011-08-23 15:28:03,306 INFO org.apache.hama.bsp.GroomServer: Lost connection to BSP Master [hnode1/10.33.1.101:40000]. Retrying...
java.util.ConcurrentModificationException
at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:373)
at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:392)
at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:391)
at org.apache.hama.bsp.GroomServer.offerService(GroomServer.java:394)
at org.apache.hama.bsp.GroomServer.run(GroomServer.java:634)
at java.lang.Thread.run(Thread.java:662)
{code}
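For reference, the ConcurrentModificationException at the end of the groom log is what a LinkedHashMap iterator throws when the map is structurally changed during iteration. The sketch below only illustrates that general fail-fast behaviour and one common workaround (iterating over a snapshot). It is not taken from GroomServer.java; the class, method, and map names are hypothetical. When the change comes from another thread, as may be the case here, taking the snapshot would also need synchronization or a concurrent map.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not the actual GroomServer code.
final class FailFastIterationSketch {

  // Removes entries while iterating the live entry set.
  // Returns true if the fail-fast iterator threw.
  static boolean removeWhileIterating(Map<String, String> tasks) {
    try {
      for (Map.Entry<String, String> e : tasks.entrySet()) {
        tasks.remove(e.getKey()); // structural change behind the iterator's back
      }
      return false;
    } catch (ConcurrentModificationException ex) {
      return true;
    }
  }

  // Iterates over a copy of the entry set, so removing from the map is safe.
  static void removeViaSnapshot(Map<String, String> tasks) {
    for (Map.Entry<String, String> e : new ArrayList<>(tasks.entrySet())) {
      tasks.remove(e.getKey());
    }
  }

  // Small helper that builds a map with three made-up task entries.
  static Map<String, String> sampleTasks() {
    Map<String, String> tasks = new LinkedHashMap<>();
    tasks.put("task_a", "RUNNING");
    tasks.put("task_b", "RUNNING");
    tasks.put("task_c", "RUNNING");
    return tasks;
  }

  public static void main(String[] args) {
    System.out.println("live iteration threw: " + removeWhileIterating(sampleTasks()));
    Map<String, String> tasks = sampleTasks();
    removeViaSnapshot(tasks);
    System.out.println("remaining after snapshot removal: " + tasks.size());
  }
}
```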
> Remove limitation on the number of tasks
> ----------------------------------------
>
> Key: HAMA-413
> URL: https://issues.apache.org/jira/browse/HAMA-413
> Project: Hama
> Issue Type: Sub-task
> Components: bsp
> Affects Versions: 0.3.0
> Reporter: Edward J. Yoon
> Assignee: Edward J. Yoon
> Fix For: 0.4.0
>
> Attachments: HAMA-413_v01.patch
>
>
> With the HAMA-410 patch, the BSPPeer object is constructed in the child process, so we can now remove the limitation on the number of tasks.
> Here's the TODO list:
> 1. The number of tasks per groom should be configurable, e.g., via 'bsp.local.tasks.maximum'.
> 2. The 'totalTaskCapacity' should be calculated in BSPMaster.getClusterStatus().
> 3. When scheduling tasks, consider how to allocate them.
> 4. Each BSPPeer should know all peers created for its job across the Hama cluster. The list can be built from the actions of the GroomServer.
> 5. In the examples, 'cluster.getGroomServers()' can be changed to 'cluster.getMaxTasks()'.
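Items 1 and 2 of the TODO list could be sketched roughly as follows. This is a minimal, self-contained illustration, not the HAMA-413 patch: it uses java.util.Properties in place of the Hama configuration object, and the class name, method names, groom names, and the default of 1 task per groom are all assumptions. Only the key 'bsp.local.tasks.maximum' comes from the issue description.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch; class and method names are not Hama's API.
final class TaskCapacitySketch {

  // Key proposed in the issue description.
  static final String MAX_TASKS_KEY = "bsp.local.tasks.maximum";

  // Assumed fallback when the key is not set.
  static final int DEFAULT_MAX_TASKS = 1;

  // Item 1: read the per-groom task limit from configuration.
  static int maxTasksPerGroom(Properties conf) {
    return Integer.parseInt(
        conf.getProperty(MAX_TASKS_KEY, String.valueOf(DEFAULT_MAX_TASKS)));
  }

  // Item 2: total capacity is the sum of the limits reported by each groom.
  static int totalTaskCapacity(Map<String, Integer> maxTasksByGroom) {
    int total = 0;
    for (int maxTasks : maxTasksByGroom.values()) {
      total += maxTasks;
    }
    return total;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(MAX_TASKS_KEY, "3");

    // 16 grooms, each reporting the same configured limit.
    Map<String, Integer> grooms = new LinkedHashMap<>();
    for (int i = 1; i <= 16; i++) {
      grooms.put("groom" + i, maxTasksPerGroom(conf));
    }
    System.out.println("totalTaskCapacity = " + totalTaskCapacity(grooms));
  }
}
```

Summing per-groom limits, rather than multiplying one limit by the number of grooms, keeps the calculation valid if grooms are configured with different maximums.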