Posted to common-user@hadoop.apache.org by Mark Kerzner <ma...@gmail.com> on 2009/09/03 18:46:43 UTC

Pregel

Hi, guys,

Pregel was revealed on 8/11. What is your opinion of it? Does anybody know
how to get the presentation, and is anyone interested in implementing it?

Thank you,
Mark

Re: Pregel

Posted by Amandeep Khurana <am...@gmail.com>.

On Thu, Sep 3, 2009 at 6:55 PM, Ted Dunning <te...@gmail.com> wrote:

> You would be entirely welcome in Mahout.   Graph based algorithms are key
> for lots of kinds of interesting learning and would be a fabulous thing to
> have in a comprehensive substrate.
>
> I personally would also be very interested in learning more about what
> sorts of things Pregel is doing.  It is relatively easy to build simple
> graph algorithms on top of Map-reduce, but these algorithms typically
> require a map-reduce iteration to propagate information.  Good algorithms
> for that architecture have exponential propagation so that you don't need a
> huge number of iterations.  It smelled like Pregel was doing something much
> more interesting.
>

I second that. It looks like Pregel is something more than just propagation.
However, the paper is still not available on ACM's website, so we don't know
any details yet. Let's wait for it to become available before discussing it
further.



Re: Pregel

Posted by Steve Loughran <st...@apache.org>.

Ted Dunning wrote:
> You would be entirely welcome in Mahout.   Graph based algorithms are key
> for lots of kinds of interesting learning and would be a fabulous thing to
> have in a comprehensive substrate.
> 
> I personally would also be very interested in learning more about what
> sorts of things Pregel is doing.  It is relatively easy to build simple
> graph algorithms on top of Map-reduce, but these algorithms typically
> require a map-reduce iteration to propagate information.  Good algorithms
> for that architecture have exponential propagation so that you don't need a
> huge number of iterations.  It smelled like Pregel was doing something much
> more interesting.
> 

Exactly: it is not pushing bits of the graph around. Instead, it has
partitioned the graph across different machines and pushes the work out
to the relevant bits of the graph, a sort of GraphReduce. Or so I believe,
not having seen the code myself :)
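
To make the "partition the graph, push the work to it" idea concrete, here
is a minimal single-process sketch in Python. It is not Pregel's API (which
was not public at the time of this thread); the function names, the
round-robin placement rule and the max-value example are all assumptions
chosen for illustration. Each simulated worker keeps its own vertices, and
only small messages cross partition boundaries between rounds.

```python
from collections import defaultdict


def partition_of(vertex, num_workers):
    # Hypothetical placement rule: integer vertex ids, assigned round-robin.
    return vertex % num_workers


def propagate_max(edges, values, num_workers=2):
    """Spread the largest vertex value through an undirected graph.

    The graph structure stays where it was placed; only messages move.
    Returns the final vertex values and the number of rounds used.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Pin every vertex (and its value) to exactly one worker.
    owned = [dict() for _ in range(num_workers)]
    for v, val in values.items():
        owned[partition_of(v, num_workers)][v] = val

    # Round 0: every vertex announces its value to its neighbours.
    inbox = defaultdict(list)
    for part in owned:
        for v, val in part.items():
            for n in neighbours[v]:
                inbox[n].append(val)

    rounds = 1
    while inbox:
        outbox = defaultdict(list)
        for part in owned:  # each worker, conceptually in parallel
            for v in part:
                best = max(inbox.get(v, []), default=part[v])
                if best > part[v]:
                    # Value improved: store it and notify neighbours.
                    part[v] = best
                    for n in neighbours[v]:
                        outbox[n].append(best)
        inbox = outbox  # messages become visible only in the next round
        rounds += 1

    result = {}
    for part in owned:
        result.update(part)
    return result, rounds
```

For example, propagate_max([(0, 1), (1, 2), (2, 3)], {0: 3, 1: 6, 2: 2, 3: 1})
leaves every vertex holding 6, regardless of how many workers the vertices
are spread over.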

Re: Pregel

Posted by Ted Dunning <te...@gmail.com>.

You would be entirely welcome in Mahout.   Graph based algorithms are key
for lots of kinds of interesting learning and would be a fabulous thing to
have in a comprehensive substrate.

I personally would also be very interested in learning more about what
sorts of things Pregel is doing.  It is relatively easy to build simple
graph algorithms on top of Map-reduce, but these algorithms typically
require a map-reduce iteration to propagate information.  Good algorithms
for that architecture have exponential propagation so that you don't need a
huge number of iterations.  It smelled like Pregel was doing something much
more interesting.
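
The difference between one-hop and exponential propagation can be shown
with a small Python sketch. This is only an illustration of the general
idea (pointer jumping on a chain of parent links), not necessarily the
algorithms referred to above; every name in it is hypothetical, and each
"round" stands in for one map-reduce iteration.

```python
def chain(n):
    """Parent links for a chain 0 <- 1 <- ... <- n-1; node 0 is the root."""
    parent = {0: 0}
    parent.update({i: i - 1 for i in range(1, n)})
    return parent


def one_hop(known, parent, v):
    # Advance by one original edge per round.
    return parent[known[v]]


def doubling(known, parent, v):
    # Reuse what the current ancestor already knows: the reach doubles.
    return known[known[v]]


def rounds_until_stable(parent, step):
    """Apply `step` to all nodes synchronously until nothing changes.

    known[v] is the farthest ancestor v has discovered so far.
    Returns the final mapping and the number of rounds that changed it.
    """
    known = dict(parent)
    rounds = 0
    while True:
        updated = {v: step(known, parent, v) for v in known}
        if updated == known:
            return known, rounds
        known = updated
        rounds += 1
```

On a chain of 1,025 nodes, rounds_until_stable(chain(1025), one_hop) takes
1,023 rounds before every node knows the root, while the doubling rule
reaches the same result in 10.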

On Thu, Sep 3, 2009 at 4:45 PM, Mark Kerzner <ma...@gmail.com> wrote:

> But Ted,
> I am interested specifically in Pregel kind of system, for distributed
> graph operations, and Mahout is for distributed learning. Here what I would
> ideally like to do:
>
>
>   - Somebody must have info on Pregel - it's out, has been presented, and
>   the information is public. Anybody has been there and can at least
>   re-tell? The subscriptions are expensive, but I could buy one copy of
>   the article, if it is available. I will also write directly to the
>   authors;
>   - Study that and discuss relevant information and architecture;
>   - Do the first implementation.
>
> I personally don't like the name Hamburg, but I could live with that.
>
>

Re: Pregel

Posted by Ted Dunning <te...@gmail.com>.

Are there any production applications that use Hama?

On Thu, Sep 3, 2009 at 7:07 PM, Edward J. Yoon <ed...@apache.org> wrote:

> Just FYI, Hama (Hadoop Matrix, http://incubator.apache.org/hama) is also
> considering adopting this computing model, which is based on bulk
> synchronous parallel (BSP).
>
> On Fri, Sep 4, 2009 at 9:57 AM, Edward J. Yoon <ed...@apache.org>
> wrote:
> > We've already made a prototype of Hamburg based on multithreading. It's
> > a BSP-based graph computing framework, not an M/R-based application.
> >
> > Please join us at http://groups.google.com/group/hamburg-dev
>



-- 
Ted Dunning, CTO
DeepDyve

Re: Pregel

Posted by "Edward J. Yoon" <ed...@apache.org>.

Just FYI, Hama (Hadoop Matrix, http://incubator.apache.org/hama) is also
considering adopting this computing model, which is based on bulk
synchronous parallel (BSP).

On Fri, Sep 4, 2009 at 9:57 AM, Edward J. Yoon <ed...@apache.org> wrote:
> We've already made a prototype of Hamburg based on multithreading. It's
> a BSP-based graph computing framework, not an M/R-based application.
>
> Please join us at http://groups.google.com/group/hamburg-dev
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Fwd: Pregel

Posted by "Edward J. Yoon" <ed...@apache.org>.

Just FYI, we've already made a prototype of Hamburg based on the BSP model.

I guess we could also improve the performance of matrix operations
using the BSP computing model.

---------- Forwarded message ----------
From: Edward J. Yoon <ed...@apache.org>
Date: Fri, Sep 4, 2009 at 9:57 AM
Subject: Re: Pregel
To: common-user@hadoop.apache.org


We've already made a prototype of Hamburg based on multithreading. It's
a BSP-based graph computing framework, not an M/R-based application.

Please join us at http://groups.google.com/group/hamburg-dev




-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Pregel

Posted by "Edward J. Yoon" <ed...@apache.org>.

We've already made a prototype of Hamburg based on multithreading. It's
a BSP-based graph computing framework, not an M/R-based application.

Please join us at http://groups.google.com/group/hamburg-dev
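
For readers unfamiliar with BSP, the following is a minimal Python sketch
of the pattern on threads: local computation, message exchange, then a
barrier before anyone reads the messages. It is not Hamburg's code; the
function name and the partial-sum example are assumptions made purely for
illustration.

```python
import threading


def bsp_sum(chunks):
    """Every worker ends up with the global sum of all chunks."""
    n = len(chunks)
    barrier = threading.Barrier(n)
    inboxes = [[] for _ in range(n)]
    locks = [threading.Lock() for _ in range(n)]
    totals = [None] * n

    def worker(rank):
        # Superstep 1: compute locally, then send the result to every peer.
        partial = sum(chunks[rank])
        for peer in range(n):
            with locks[peer]:
                inboxes[peer].append(partial)
        # Barrier: nobody proceeds until all messages have been delivered.
        barrier.wait()
        # Superstep 2: messages from superstep 1 are now safe to read.
        totals[rank] = sum(inboxes[rank])

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

Calling bsp_sum([[1, 2], [3, 4], [5]]) returns [15, 15, 15]: the barrier
guarantees that each worker sees all three partial sums before combining
them.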

On Fri, Sep 4, 2009 at 8:45 AM, Mark Kerzner<ma...@gmail.com> wrote:
> But Ted,
> I am interested specifically in Pregel kind of system, for distributed graph
> operations, and Mahout is for distributed learning. Here what I would
> ideally like to do:
>
>
>   - Somebody must have info on Pregel - it's out, has been presented, and
>   the information is public. Anybody has been there and can at least re-tell?
>   The subscriptions are expensive, but I could buy one copy of the article, if
>   it is available. I will also write directly to the authors;
>   - Study that and discuss relevant information and architecture;
>   - Do the first implementation.
>
> I personally don't like the name Hamburg, but I could live with that.
>
> Mark
>
> On Thu, Sep 3, 2009 at 6:36 PM, Ted Dunning <te...@gmail.com> wrote:
>
>> Hamburg has been excessively "stable" for some time.  If you want to do
>> something, I would recommend contributing to Mahout.
>>
>> On Thu, Sep 3, 2009 at 3:51 PM, Ashutosh Chauhan <ashutosh.chauhan@gmail.com> wrote:
>>
>> > Hamburg is here: http://wiki.apache.org/hadoop/Hamburg
>> >
>> >
>>
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Pregel

Posted by Mark Kerzner <ma...@gmail.com>.

But Ted,
I am interested specifically in a Pregel kind of system, for distributed
graph operations, and Mahout is for distributed learning. Here is what I
would ideally like to do:

   - Somebody must have info on Pregel: it's out, has been presented, and
   the information is public. Has anybody been there who can at least
   re-tell it? The subscriptions are expensive, but I could buy one copy of
   the article, if it is available. I will also write directly to the
   authors;
   - Study that and discuss relevant information and architecture;
   - Do the first implementation.

I personally don't like the name Hamburg, but I could live with that.

Mark

On Thu, Sep 3, 2009 at 6:36 PM, Ted Dunning <te...@gmail.com> wrote:

> Hamburg has been excessively "stable" for some time.  If you want to do
> something, I would recommend contributing to Mahout.
>
> On Thu, Sep 3, 2009 at 3:51 PM, Ashutosh Chauhan <ashutosh.chauhan@gmail.com> wrote:
>
> > Hamburg is here: http://wiki.apache.org/hadoop/Hamburg
> >
> >
>

Re: Pregel

Posted by Ted Dunning <te...@gmail.com>.

Hamburg has been excessively "stable" for some time.  If you want to do
something, I would recommend contributing to Mahout.

On Thu, Sep 3, 2009 at 3:51 PM, Ashutosh Chauhan <ashutosh.chauhan@gmail.com> wrote:

> Hamburg is here: http://wiki.apache.org/hadoop/Hamburg
>
>

Re: JobTracker startup failure when starting hadoop-0.20.0 cluster on Amazon EC2 with contrib/ec2 scripts

Posted by Tom White <to...@cloudera.com>.

Hi Jeyendran,

Were there any errors reported in the datanode logs? There could be a
problem with datanodes contacting the namenode, caused by firewall
configuration problems (EC2 security groups).
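
One quick way to test the firewall theory is to try a plain TCP connection
from a slave instance to the namenode. The Python sketch below does only
that; the helper name is hypothetical, and the host and port to pass in are
assumptions that should be taken from the fs.default.name setting of the
cluster in question.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return False
```

If can_reach(namenode_host, namenode_port) returns False on a slave while
the namenode process is running, the security group rules are a likely
suspect; a True result only shows that the port is open, not that the
datanode has registered.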

Cheers,
Tom


Re: JobTracker startup failure when starting hadoop-0.20.0 cluster on Amazon EC2 with contrib/ec2 scripts

Posted by Yuanyuan Tian <yt...@us.ibm.com>.

Hi Jeyendran,

I ran into exactly the same problem when setting up Hadoop 0.20 on EC2. I
found your post through Google and was wondering whether you've found a
solution yet. Or does anyone else have a solution for this?

Thanks,

Yuanyuan


From:    "Jeyendran Balakrishnan" <jb...@docomolabs-usa.com>
To:      <co...@hadoop.apache.org>
Date:    09/03/2009 04:18 PM
Subject: JobTracker startup failure when starting hadoop-0.20.0 cluster on Amazon EC2 with contrib/ec2 scripts





JobTracker startup failure when starting hadoop-0.20.0 cluster on Amazon EC2 with contrib/ec2 scripts

Posted by Jeyendran Balakrishnan <jb...@docomolabs-usa.com>.

I downloaded Hadoop 0.20.0 and used the src/contrib/ec2/bin scripts to
launch a Hadoop cluster on Amazon EC2, after building a new Hadoop
0.20.0 AMI. 

I launched an instance with my new Hadoop 0.20.0 AMI, then logged in and
ran the following to launch a new cluster:
root(/vol/hadoop-0.20.0)> bin/launch-hadoop-cluster hadoop-test 2

After the usual EC2 wait, one master and two slave instances were
launched on EC2, as expected. When I ssh'ed into the instances, here is
what I found:

Slaves: DataNode and NameNode are running
Master: Only NameNode is running

I could use HDFS commands (using $HADOOP_HOME/bin/hadoop scripts)
without any problems, from both master and slaves. However, since
JobTracker is not running, I cannot run map-reduce jobs.

I checked the logs from /vol/hadoop-0.20.0/logs for the JobTracker,
reproduced below:
----------------------------------------------------------------------------
2009-09-03 18:55:38,486 WARN org.apache.hadoop.conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
2009-09-03 18:55:38,520 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG:   host = domU-12-31-39-06-44-E3.compute-1.internal/10.208.75.17
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr  9 05:18:40 UTC 2009
************************************************************/
2009-09-03 18:55:38,652 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=50002
2009-09-03 18:55:38,703 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2009-09-03 18:55:38,827 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
2009-09-03 18:55:38,827 INFO org.mortbay.log: jetty-6.1.14
2009-09-03 18:55:48,425 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50030
2009-09-03 18:55:48,427 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
2009-09-03 18:55:48,432 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 50002
2009-09-03 18:55:48,432 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2009-09-03 18:55:48,541 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2009-09-03 18:55:48,628 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy4.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy4.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755)

        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)

2009-09-03 18:55:48,628 WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping /mnt/hadoop/mapred/system/jobtracker.info retries left 4
2009-09-03 18:55:49,030 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
...
----------------------------------------------------------------------------
The JobTracker ports are all free [not used by any other process].

Any suggestions would be appreciated.

Thanks a lot,
Jeyendran

Re: Pregel

Posted by Ashutosh Chauhan <as...@gmail.com>.

Hamburg is here: http://wiki.apache.org/hadoop/Hamburg

Ashutosh

On Thu, Sep 3, 2009 at 16:42, Mark Kerzner <ma...@gmail.com> wrote:

> Ok, then, I can join hamburg. Where is it?
>
> On Thu, Sep 3, 2009 at 3:12 PM, Amandeep Khurana <am...@gmail.com> wrote:
>
> > There is another project-  Hamburg - on similar lines. Check that out too.
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> >
> > On Thu, Sep 3, 2009 at 1:08 PM, Mark Kerzner <ma...@gmail.com>
> > wrote:
> >
> > > Brenta
> > > http://en.wikipedia.org/wiki/Brenta_(river)
> > >
> > > On Thu, Sep 3, 2009 at 3:07 PM, Mark Kerzner <ma...@gmail.com>
> > > wrote:
> > >
> > > > > Then we should think of a name and create the project somewhere. Does not
> > > > > have to be the same place as Hadoop, can be Google code to start with...
> > > > > How about
> > > >
> > > > Madoop
> > > > Mississippi (221 bridges)
> > > > Danube (lotsa bridges)
> > > >
> > > > Mark
> > > >
> > > >
> > > >
> > > > > On Thu, Sep 3, 2009 at 2:53 PM, Amandeep Khurana <am...@gmail.com> wrote:
> > > >
> > > >> I'm interested in working on it.
> > > >>
> > > > >> The paper is still not out.. Only the summary has been made available. Am I
> > > > >> missing something?
> > > >>
> > > >>
> > > >> Amandeep Khurana
> > > >> Computer Science Graduate Student
> > > >> University of California, Santa Cruz
> > > >>
> > > >>
> > > > >> On Thu, Sep 3, 2009 at 9:46 AM, Mark Kerzner <markkerzner@gmail.com>
> > > > >> wrote:
> > > >>
> > > >> > Hi, guys,
> > > >> >
> > > > >> > Pregel has been revealed on 8/11, what is your opinion of, does anybody know
> > > > >> > how to get the presentation, and is anyone interested in implementing it?
> > > >> >
> > > >> > Thank you,
> > > >> > Mark
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>

Re: Pregel

Posted by Mark Kerzner <ma...@gmail.com>.

OK, then, I can join Hamburg. Where is it?

On Thu, Sep 3, 2009 at 3:12 PM, Amandeep Khurana <am...@gmail.com> wrote:

> There is another project-  Hamburg - on similar lines. Check that out too.
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Thu, Sep 3, 2009 at 1:08 PM, Mark Kerzner <ma...@gmail.com>
> wrote:
>
> > Brenta
> > http://en.wikipedia.org/wiki/Brenta_(river)
> >
> > On Thu, Sep 3, 2009 at 3:07 PM, Mark Kerzner <ma...@gmail.com>
> > wrote:
> >
> > > Then we should think of a name and create the project somewhere. Does not
> > > have to be the same place as Hadoop, can be Google code to start with...
> > > How about
> > >
> > > Madoop
> > > Mississippi (221 bridges)
> > > Danube (lotsa bridges)
> > >
> > > Mark
> > >
> > >
> > >
> > > On Thu, Sep 3, 2009 at 2:53 PM, Amandeep Khurana <am...@gmail.com> wrote:
> > >
> > >> I'm interested in working on it.
> > >>
> > >> The paper is still not out.. Only the summary has been made available. Am I
> > >> missing something?
> > >>
> > >>
> > >> Amandeep Khurana
> > >> Computer Science Graduate Student
> > >> University of California, Santa Cruz
> > >>
> > >>
> > >> On Thu, Sep 3, 2009 at 9:46 AM, Mark Kerzner <ma...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi, guys,
> > >> >
> > >> > Pregel has been revealed on 8/11, what is your opinion of, does anybody know
> > >> > how to get the presentation, and is anyone interested in implementing it?
> > >> >
> > >> > Thank you,
> > >> > Mark
> > >> >
> > >>
> > >
> > >
> >
>

Re: Pregel

Posted by Amandeep Khurana <am...@gmail.com>.

There is another project, Hamburg, along similar lines. Check that out too.


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Thu, Sep 3, 2009 at 1:08 PM, Mark Kerzner <ma...@gmail.com> wrote:

> Brenta
> http://en.wikipedia.org/wiki/Brenta_(river)
>
> On Thu, Sep 3, 2009 at 3:07 PM, Mark Kerzner <ma...@gmail.com>
> wrote:
>
> > Then we should think of a name and create the project somewhere. Does not
> > have to be the same place as Hadoop, can be Google code to start with...
> > How about
> >
> > Madoop
> > Mississippi (221 bridges)
> > Danube (lotsa bridges)
> >
> > Mark
> >
> >
> >
> > On Thu, Sep 3, 2009 at 2:53 PM, Amandeep Khurana <am...@gmail.com> wrote:
> >
> >> I'm interested in working on it.
> >>
> >> The paper is still not out.. Only the summary has been made available. Am I
> >> missing something?
> >>
> >>
> >> Amandeep Khurana
> >> Computer Science Graduate Student
> >> University of California, Santa Cruz
> >>
> >>
> >> On Thu, Sep 3, 2009 at 9:46 AM, Mark Kerzner <ma...@gmail.com>
> >> wrote:
> >>
> >> > Hi, guys,
> >> >
> >> > Pregel has been revealed on 8/11, what is your opinion of, does anybody know
> >> > how to get the presentation, and is anyone interested in implementing it?
> >> >
> >> > Thank you,
> >> > Mark
> >> >
> >>
> >
> >
>

Re: Pregel

Posted by Mark Kerzner <ma...@gmail.com>.

Brenta
http://en.wikipedia.org/wiki/Brenta_(river)

On Thu, Sep 3, 2009 at 3:07 PM, Mark Kerzner <ma...@gmail.com> wrote:

> Then we should think of a name and create the project somewhere. Does not
> have to be the same place as Hadoop, can be Google code to start with...
> How about
>
> Madoop
> Mississippi (221 bridges)
> Danube (lotsa bridges)
>
> Mark
>
>
>
> On Thu, Sep 3, 2009 at 2:53 PM, Amandeep Khurana <am...@gmail.com> wrote:
>
>> I'm interested in working on it.
>>
>> The paper is still not out.. Only the summary has been made available. Am I
>> missing something?
>>
>>
>> Amandeep Khurana
>> Computer Science Graduate Student
>> University of California, Santa Cruz
>>
>>
>> On Thu, Sep 3, 2009 at 9:46 AM, Mark Kerzner <ma...@gmail.com>
>> wrote:
>>
>> > Hi, guys,
>> >
>> > Pregel has been revealed on 8/11, what is your opinion of, does anybody know
>> > how to get the presentation, and is anyone interested in implementing it?
>> >
>> > Thank you,
>> > Mark
>> >
>>
>
>

Re: Pregel

Posted by Mark Kerzner <ma...@gmail.com>.

Then we should think of a name and create the project somewhere. It does not
have to be in the same place as Hadoop; it can be on Google Code to start with.
How about:

Madoop
Mississippi (221 bridges)
Danube (lotsa bridges)

Mark



On Thu, Sep 3, 2009 at 2:53 PM, Amandeep Khurana <am...@gmail.com> wrote:

> I'm interested in working on it.
>
> The paper is still not out.. Only the summary has been made available. Am I
> missing something?
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Thu, Sep 3, 2009 at 9:46 AM, Mark Kerzner <ma...@gmail.com>
> wrote:
>
> > Hi, guys,
> >
> > Pregel has been revealed on 8/11, what is your opinion of, does anybody know
> > how to get the presentation, and is anyone interested in implementing it?
> >
> > Thank you,
> > Mark
> >
>

Re: Pregel

Posted by Amandeep Khurana <am...@gmail.com>.

I'm interested in working on it.

The paper is still not out. Only the summary has been made available. Am I
missing something?

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

On Thu, Sep 3, 2009 at 9:46 AM, Mark Kerzner <ma...@gmail.com> wrote:

> Hi, guys,
>
> Pregel has been revealed on 8/11, what is your opinion of, does anybody know
> how to get the presentation, and is anyone interested in implementing it?
>
> Thank you,
> Mark
>