Posted to dev@samza.apache.org by Telles Nobrega <te...@gmail.com> on 2014/08/08 16:55:49 UTC

Running Job on Multinode Yarn Cluster

Hi,

this is my first time trying to run a job in a multinode environment. I
have the cluster set up, and I can see in the GUI that all the nodes are working.
Do I need to have the job folder on each machine in my cluster?
 - The first time I tried running, the job package was only on the namenode
machine, and the submission failed with:

Application application_1407509228798_0001 failed 2 times due to AM
Container for appattempt_1407509228798_0001_000002 exited with exitCode:
-1000 due to: File
file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
does not exist

So I copied the folder to each machine in my cluster and got this error:

Application application_1407509228798_0002 failed 2 times due to AM
Container for appattempt_1407509228798_0002_000002 exited with exitCode:
-1000 due to: Resource
file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
changed on src filesystem (expected 1407509168000, was 1407509434000)

What am I missing?

p.s.: I followed this tutorial
<https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
and this one
<http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html>
to set up the cluster.

Help is much appreciated.

Thanks in advance.

-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Yan Fang <ya...@gmail.com>.
Cool. Glad to hear that. Thank you.

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108


On Tue, Aug 12, 2014 at 9:49 AM, Telles Nobrega <te...@gmail.com>
wrote:

> That was the problem. Thanks for the help, I was able to run it.
>
> I really appreciate all the time you guys took to help me out.
>
>
>
> On Tue, Aug 12, 2014 at 1:43 PM, Yan Fang <ya...@gmail.com> wrote:
>
> > Yes, tar.gz should have all the necessary libs. If this error does not
> pop
> > up when you run "run-job", my guess is that you may forget to reupload
> the
> > tar.gz package after you recompile.
> >
> > Fang, Yan
> > yanfang724@gmail.com
> > +1 (206) 849-4108
> >
> >
> > On Tue, Aug 12, 2014 at 6:34 AM, Telles Nobrega <tellesnobrega@gmail.com
> >
> > wrote:
> >
> > > What is the expected behavior here. The tar.gz file is in hdfs, it
> should
> > > find all necessary libs in the tar.gz right?
> > >
> > >
> > > On Tue, Aug 12, 2014 at 10:19 AM, Telles Nobrega <
> > tellesnobrega@gmail.com>
> > > wrote:
> > >
> > > > Chris and Yan,
> > > >
> > > > I was able to run the job but I got the error:
> > > >
> > > > Exception in thread "main" java.util.ServiceConfigurationError:
> > > > org.apache.hadoop.fs.FileSystem: Provider
> > > > org.apache.hadoop.hdfs.DistributedFileSystem could not be
> instantiated
> > > >         at java.util.ServiceLoader.fail(ServiceLoader.java:224)
> > > >         at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
> > > >         at
> > > > java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
> > > >         at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
> > > >         at
> > > > org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2400)
> > > >         at
> > > >
> > org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
> > > >         at
> > > >
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> > > >         at
> > org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> > > >         at
> > > >
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> > > >         at
> > > org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> > > >         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> > > >         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> > > >         at
> > > >
> > >
> >
> org.apache.samza.job.yarn.SamzaAppMasterTaskManager.startContainer(SamzaAppMasterTaskManager.scala:278)
> > > >          at
> > > >
> > >
> >
> org.apache.samza.job.yarn.SamzaAppMasterTaskManager.onContainerAllocated(SamzaAppMasterTaskManager.scala:126)
> > > >         at
> > > >
> > >
> >
> org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8$$anonfun$apply$2.apply(YarnAppMaster.scala:66)
> > > >         at
> > > >
> > >
> >
> org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8$$anonfun$apply$2.apply(YarnAppMaster.scala:66)
> > > >         at scala.collection.immutable.List.foreach(List.scala:318)
> > > >         at
> > > >
> > >
> >
> org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8.apply(YarnAppMaster.scala:66)
> > > >         at
> > > >
> > >
> >
> org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8.apply(YarnAppMaster.scala:66)
> > > >         at
> scala.collection.Iterator$class.foreach(Iterator.scala:727)
> > > >         at
> > scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> > > >         at
> > > > scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> > > >         at
> scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> > > >         at
> > > > org.apache.samza.job.yarn.YarnAppMaster.run(YarnAppMaster.scala:66)
> > > >         at
> > > >
> org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:81)
> > > >         at
> > > > org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
> > > >  Caused by: java.lang.NoClassDefFoundError:
> > > > org/apache/hadoop/conf/Configuration$DeprecationDelta
> > > >         at
> > > >
> > >
> >
> org.apache.hadoop.hdfs.HdfsConfiguration.addDeprecatedKeys(HdfsConfiguration.java:66)
> > > >         at
> > > >
> > >
> >
> org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:31)
> > > >         at
> > > >
> > >
> >
> org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:106)
> > > >         at
> > sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> > > > Method)
> > > >         at
> > > >
> > >
> >
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> > > >         at
> > > >
> > >
> >
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> > > >         at
> > > java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> > > >         at java.lang.Class.newInstance(Class.java:374)
> > > >         at
> > > > java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
> > > >          ... 23 more
> > > > Caused by: java.lang.ClassNotFoundException:
> > > > org.apache.hadoop.conf.Configuration$DeprecationDelta
> > > >         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> > > >         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > > >         at java.security.AccessController.doPrivileged(Native Method)
> > > >         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > > >         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > > >         at
> > sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> > > >         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> > > >         ... 32 more
> > > >
> > > > In the machine that is running the job. Do I need to put the jar
> files
> > > > there too? and where?
> > > >
> > > > Thanks
> > > >
> > > >
> > > > On Tue, Aug 12, 2014 at 9:17 AM, Telles Nobrega <
> > tellesnobrega@gmail.com
> > > >
> > > > wrote:
> > > >
> > > >> Sorry for bothering this much.
> > > >>
> > > >>
> > > >> On Tue, Aug 12, 2014 at 9:17 AM, Telles Nobrega <
> > > tellesnobrega@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> Now I have this error:
> > > >>>
> > > >>> Exception in thread "main" java.net.ConnectException: Call From
> > > >>> telles-samza-master/10.1.0.79 to telles-samza-master:8020 failed
> on
> > > >>> connection exception: java.net.ConnectException: Connection
> refused;
> > > For
> > > >>> more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> > > >>>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> > > >>> Method)
> > > >>> at
> > > >>>
> > >
> >
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> > > >>>  at
> > > >>>
> > >
> >
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> > > >>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> > > >>>  at
> org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
> > > >>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
> > > >>>  at org.apache.hadoop.ipc.Client.call(Client.java:1410)
> > > >>> at org.apache.hadoop.ipc.Client.call(Client.java:1359)
> > > >>>  at
> > > >>>
> > >
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> > > >>> at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> > > >>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >>> at
> > > >>>
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > >>>  at
> > > >>>
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >>> at java.lang.reflect.Method.invoke(Method.java:606)
> > > >>>  at
> > > >>>
> > >
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> > > >>> at
> > > >>>
> > >
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> > > >>>  at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> > > >>> at
> > > >>>
> > >
> >
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
> > > >>>  at
> org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
> > > >>> at
> > > >>>
> > >
> >
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
> > > >>>  at
> > > >>>
> > >
> >
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
> > > >>> at
> > > >>>
> > >
> >
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> > > >>>  at
> > > >>>
> > >
> >
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
> > > >>> at
> > > >>>
> > >
> >
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> > > >>>  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> > > >>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> > > >>>  at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> > > >>> at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> > > >>>  at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> > > >>> Caused by: java.net.ConnectException: Connection refused
> > > >>>  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> > > >>> at
> > > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
> > > >>>  at
> > > >>>
> > >
> >
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> > > >>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
> > > >>>  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
> > > >>> at
> > > >>>
> > >
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:601)
> > > >>>  at
> > > >>>
> > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:696)
> > > >>> at
> > org.apache.hadoop.ipc.Client$Connection.access$2700(Client.java:367)
> > > >>>  at org.apache.hadoop.ipc.Client.getConnection(Client.java:1458)
> > > >>> at org.apache.hadoop.ipc.Client.call(Client.java:1377)
> > > >>>  ... 22 more
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Tue, Aug 12, 2014 at 3:39 AM, Yan Fang <ya...@gmail.com>
> > > wrote:
> > > >>>
> > > >>>> Hi Telles,
> > > >>>>
> > > >>>> I think you put the wrong port. Usually, the HDFS port is 8020,
> not
> > > >>>> 50070.
> > > >>>> You should put something like:
> > > >>>>
> > *hdfs://telles**-samza-master:8020*/path/to/samza-job-package.tar.gz.
> > > >>>> Thanks.
> > > >>>>
> > > >>>> Fang, Yan
> > > >>>> yanfang724@gmail.com
> > > >>>> +1 (206) 849-4108
> > > >>>>
> > > >>>>
> > > >>>> On Mon, Aug 11, 2014 at 8:31 PM, Telles Nobrega <
> > > >>>> tellesnobrega@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>> > I tried moving from HDFS to HttpFileSystem. I’m getting the
> > > >>>> HttpFileSystem
> > > >>>> > not found exception. I have done the steps in the tutorial that
> > > Chris
> > > >>>> > pasted below (I had done that before, but I’m not sure what is
> the
> > > >>>> > problem). Seems like since I have the compiled file in one
> machine
> > > >>>> > (resource manager) and I submit it and try to download from the
> > node
> > > >>>> > managers, they don’t have samza-yarn.jar (don’t know how to
> > include
> > > >>>> it,
> > > >>>> > since the run will be done in the resource manager).
> > > >>>> >
> > > >>>> > Can you give me a tip on how to solve this?
> > > >>>> >
> > > >>>> > Thanks in advance.
> > > >>>> >
> > > >>>> > ps. the folder and tar.gz of the job are located in one machine
> > > >>>> alone, is
> > > >>>> > that the right way to do it or do I need to replicate
> hello-samza
> > in
> > > >>>> all
> > > >>>> > machines to run it?
> > > >>>> > On 11 Aug 2014, at 23:12, Telles Nobrega <
> tellesnobrega@gmail.com
> > >
> > > >>>> wrote:
> > > >>>> >
> > > >>>> > > What is your suggestion here, should I keep going on this
> quest
> > to
> > > >>>> fix
> > > >>>> > hdfs or should I try to run using HttpFileSystem?
> > > >>>> > > On 11 Aug 2014, at 23:01, Telles Nobrega <
> > tellesnobrega@gmail.com
> > > >
> > > >>>> > wrote:
> > > >>>> > >
> > > >>>> > >> The port is right?? 50700. I have no idea what is happening
> > now.
> > > >>>> > >>
> > > >>>> > >> On 11 Aug 2014, at 22:33, Telles Nobrega <
> > > tellesnobrega@gmail.com>
> > > >>>> > wrote:
> > > >>>> > >>
> > > >>>> > >>> Right now the error is the following:
> > > >>>> > >>> Exception in thread "main" java.io.IOException: Failed on
> > local
> > > >>>> > exception: com.google.protobuf.InvalidProtocolBufferException:
> > > >>>> Protocol
> > > >>>> > message end-group tag did not match expected tag.; Host Details
> :
> > > >>>> local
> > > >>>> > host is: "telles-samza-master/10.1.0.79"; destination host is:
> > > >>>> > "telles-samza-master":50070;
> > > >>>> > >>>     at
> > > >>>> org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> > > >>>> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1410)
> > > >>>> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1359)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> > > >>>> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> > > >>>> > >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> > > Method)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >>>> > >>>     at java.lang.reflect.Method.invoke(Method.java:606)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> > > >>>> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
> > > >>>> > >>>     at
> > > >>>> >
> org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> > > >>>> > >>>     at
> > > org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> > > >>>> > >>>     at
> > > org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> > > >>>> > >>>     at
> org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> > > >>>> > >>>     at
> > org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> > > >>>> > >>>     at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> > > >>>> > >>> Caused by:
> com.google.protobuf.InvalidProtocolBufferException:
> > > >>>> > Protocol message end-group tag did not match expected tag.
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
> > > >>>> > >>>     at
> > > >>>> >
> > > >>>>
> > >
> >
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
> > > >>>> > >>>     at
> > > >>>> org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)
> > > >>>> > >>>
> > > >>>> > >>> I feel that I’m close to making it run. Thanks for the help
> in
> > > >>>> advance.
> > > >>>> > >>> On 11 Aug 2014, at 22:06, Telles Nobrega <
> > > tellesnobrega@gmail.com
> > > >>>> >
> > > >>>> > wrote:
> > > >>>> > >>>
> > > >>>> > >>>> Hi, I downloaded hadoop-common-2.3.0.jar and it worked
> > better.
> > > >>>> Now
> > > >>>> > I’m having a configuration problem with my host, but it looks
> like
> > > >>>> the hdfs
> > > >>>> > is not a problem anymore.
> > > >>>> > >>>>
> > > >>>> > >>>>
> > > >>>> > >>>>
> > > >>>> > >>>>
> > > >>>> > >>>> On 11 Aug 2014, at 22:04, Telles Nobrega <
> > > >>>> tellesnobrega@gmail.com>
> > > >>>> > wrote:
> > > >>>> > >>>>
> > > >>>> > >>>>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency.
> > > >>>> Recompiled
> > > >>>> > the project, extracted to deploy/samza and there problem still
> > > >>>> happens. I
> > > >>>> > downloaded hadoop-client-2.3.0.jar and the problems still
> happens,
> > > >>>> > hadoop-common is 2.2.0 does this is a problem? I will try with
> > 2.3.0
> > > >>>> > >>>>>
> > > >>>> > >>>>> Actually a lot of hadoop jars are 2.2.0
> > > >>>> > >>>>>
> > > >>>> > >>>>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com>
> > > >>>> wrote:
> > > >>>> > >>>>>
> > > >>>> > >>>>>> <include>org.apache.hadoop:hadoop-hdfs</include>
> > > >>>> > >>>>>
> > > >>>> > >>>>
> > > >>>> > >>>
> > > >>>> > >>
> > > >>>> > >
> > > >>>> >
> > > >>>> >
> > > >>>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> ------------------------------------------
> > > >>> Telles Mota Vidal Nobrega
> > > >>> M.sc. Candidate at UFCG
> > > >>> B.sc. in Computer Science at UFCG
> > > >>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> ------------------------------------------
> > > >> Telles Mota Vidal Nobrega
> > > >> M.sc. Candidate at UFCG
> > > >> B.sc. in Computer Science at UFCG
> > > >> Software Engineer at OpenStack Project - HP/LSD-UFCG
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > ------------------------------------------
> > > > Telles Mota Vidal Nobrega
> > > > M.sc. Candidate at UFCG
> > > > B.sc. in Computer Science at UFCG
> > > > Software Engineer at OpenStack Project - HP/LSD-UFCG
> > > >
> > >
> > >
> > >
> > > --
> > > ------------------------------------------
> > > Telles Mota Vidal Nobrega
> > > M.sc. Candidate at UFCG
> > > B.sc. in Computer Science at UFCG
> > > Software Engineer at OpenStack Project - HP/LSD-UFCG
> > >
> >
>
>
>
> --
> ------------------------------------------
> Telles Mota Vidal Nobrega
> M.sc. Candidate at UFCG
> B.sc. in Computer Science at UFCG
> Software Engineer at OpenStack Project - HP/LSD-UFCG
>

Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
That was the problem. Thanks for the help, I was able to run it.

I really appreciate all the time you guys took to help me out.




-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Yan Fang <ya...@gmail.com>.
Yes, the tar.gz should have all the necessary libs. If this error does not pop
up when you run "run-job", my guess is that you may have forgotten to reupload
the tar.gz package after recompiling.
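
For example, a typical rebuild-and-redeploy cycle would look something like the
following (the HDFS path and the properties file name here are only placeholders;
use whatever your yarn.package.path points at):

    # rebuild the job package
    mvn clean package
    # replace the stale package on HDFS with the freshly built one
    hdfs dfs -rm /path/to/samza-job-package-0.7.0-dist.tar.gz
    hdfs dfs -put samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz \
        /path/to/samza-job-package-0.7.0-dist.tar.gz
    # resubmit the job
    deploy/samza/bin/run-job.sh \
        --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory \
        --config-path=file://$PWD/deploy/samza/config/your-job.properties

That way the package YARN downloads actually contains the recompiled jars.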

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108



Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
What is the expected behavior here? The tar.gz file is in HDFS, so it should
find all the necessary libs inside the tar.gz, right?
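
(A quick sanity check, assuming the package was built from hello-samza's
samza-job-package module: list the archive and confirm the Hadoop jars actually
made it in, then re-upload the fresh build. The /samza target path is only an
example.

    tar -tzf samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz | grep hadoop
    hadoop fs -put samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz /samza/

If the listing shows hadoop-hdfs and hadoop-common at different versions, that
mismatch is worth fixing before anything else.)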


On Tue, Aug 12, 2014 at 10:19 AM, Telles Nobrega <te...@gmail.com>
wrote:

> Chris and Yan,
>
> I was able to run the job but I got the error:
>
> Exception in thread "main" java.util.ServiceConfigurationError:
> org.apache.hadoop.fs.FileSystem: Provider
> org.apache.hadoop.hdfs.DistributedFileSystem could not be instantiated
>         at java.util.ServiceLoader.fail(ServiceLoader.java:224)
>         at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
>         at
> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
>         at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
>         at
> org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2400)
>         at
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
>         at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>         at
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>         at
> org.apache.samza.job.yarn.SamzaAppMasterTaskManager.startContainer(SamzaAppMasterTaskManager.scala:278)
>          at
> org.apache.samza.job.yarn.SamzaAppMasterTaskManager.onContainerAllocated(SamzaAppMasterTaskManager.scala:126)
>         at
> org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8$$anonfun$apply$2.apply(YarnAppMaster.scala:66)
>         at
> org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8$$anonfun$apply$2.apply(YarnAppMaster.scala:66)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at
> org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8.apply(YarnAppMaster.scala:66)
>         at
> org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8.apply(YarnAppMaster.scala:66)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>         at
> scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at
> org.apache.samza.job.yarn.YarnAppMaster.run(YarnAppMaster.scala:66)
>         at
> org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:81)
>         at
> org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
>  Caused by: java.lang.NoClassDefFoundError:
> org/apache/hadoop/conf/Configuration$DeprecationDelta
>         at
> org.apache.hadoop.hdfs.HdfsConfiguration.addDeprecatedKeys(HdfsConfiguration.java:66)
>         at
> org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:31)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:106)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at java.lang.Class.newInstance(Class.java:374)
>         at
> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
>          ... 23 more
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.conf.Configuration$DeprecationDelta
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>         ... 32 more
>
> In the machine that is running the job. Do I need to put the jar files
> there too? and where?
>
> Thanks
>
>
> On Tue, Aug 12, 2014 at 9:17 AM, Telles Nobrega <te...@gmail.com>
> wrote:
>
>> Sorry for bothering this much.
>>
>>
>> On Tue, Aug 12, 2014 at 9:17 AM, Telles Nobrega <te...@gmail.com>
>> wrote:
>>
>>> Now I have this error:
>>>
>>> Exception in thread "main" java.net.ConnectException: Call From
>>> telles-samza-master/10.1.0.79 to telles-samza-master:8020 failed on
>>> connection exception: java.net.ConnectException: Connection refused; For
>>> more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>>>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>> Method)
>>> at
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>  at
>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>  at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
>>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
>>>  at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>>>  at
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>> at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>  at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>  at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>> at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>>  at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>>> at
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
>>>  at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
>>>  at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
>>> at
>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>  at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
>>> at
>>> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>>>  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>>  at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>>> at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>>>  at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>>> Caused by: java.net.ConnectException: Connection refused
>>>  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>  at
>>> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>>>  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>>> at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:601)
>>>  at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:696)
>>> at org.apache.hadoop.ipc.Client$Connection.access$2700(Client.java:367)
>>>  at org.apache.hadoop.ipc.Client.getConnection(Client.java:1458)
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1377)
>>>  ... 22 more
>>>
>>>
>>>
>>> On Tue, Aug 12, 2014 at 3:39 AM, Yan Fang <ya...@gmail.com> wrote:
>>>
>>>> Hi Telles,
>>>>
>>>> I think you put the wrong port. Usually, the HDFS port is 8020, not
>>>> 50070.
>>>> You should put something like:
>>>> *hdfs://telles**-samza-master:8020*/path/to/samza-job-package.taz.gz.
>>>> Thanks.
>>>>
>>>> Fang, Yan
>>>> yanfang724@gmail.com
>>>> +1 (206) 849-4108
>>>>
>>>>
>>>> On Mon, Aug 11, 2014 at 8:31 PM, Telles Nobrega <
>>>> tellesnobrega@gmail.com>
>>>> wrote:
>>>>
>>>> > I tried moving from HDFS to HttpFileSystem. I’m getting the
>>>> HttpFileSystem
>>>> > not found exception. I have done the steps in the tutorial that Chris
>>>> > pasted below (I had done that before, but I’m not sure what is the
>>>> > problem). Seems like since I have the compiled file in one machine
>>>> > (resource manager) and I submit it and try to download from the node
>>>> > managers, they don’t have samza-yarn.jar (don’t know how to include
>>>> it,
>>>> > since the run will be done in the resource manager).
>>>> >
>>>> > Can you give me a tip on how to solve this?
>>>> >
>>>> > Thanks in advance.
>>>> >
>>>> > ps. the folder and tar.gz of the job are located in one machine
>>>> alone, is
>>>> > that the right way to do it or do I need to replicate hello-samza in
>>>> all
>>>> > machines to run it?
>>>> > On 11 Aug 2014, at 23:12, Telles Nobrega <te...@gmail.com>
>>>> wrote:
>>>> >
>>>> > > What is your suggestion here, should I keep going on this quest to
>>>> fix
>>>> > hdfs or should I try to run using HttpFileSystem?
>>>> > > On 11 Aug 2014, at 23:01, Telles Nobrega <te...@gmail.com>
>>>> > wrote:
>>>> > >
>>>> > >> The port is right?? 50700. I have no idea what is happening now.
>>>> > >>
>>>> > >> On 11 Aug 2014, at 22:33, Telles Nobrega <te...@gmail.com>
>>>> > wrote:
>>>> > >>
>>>> > >>> Right now the error is the following:
>>>> > >>> Exception in thread "main" java.io.IOException: Failed on local
>>>> > exception: com.google.protobuf.InvalidProtocolBufferException:
>>>> Protocol
>>>> > message end-group tag did not match expected tag.; Host Details :
>>>> local
>>>> > host is: "telles-samza-master/10.1.0.79"; destination host is:
>>>> > "telles-samza-master":50070;
>>>> > >>>     at
>>>> org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>>>> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>>>> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>>>> > >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> > >>>     at
>>>> >
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>> > >>>     at
>>>> >
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> > >>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>>> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
>>>> > >>>     at
>>>> > org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
>>>> > >>>     at
>>>> >
>>>> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>>>> > >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>>> > >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>>> > >>>     at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>>>> > >>>     at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>>>> > >>>     at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>>>> > >>> Caused by: com.google.protobuf.InvalidProtocolBufferException:
>>>> > Protocol message end-group tag did not match expected tag.
>>>> > >>>     at
>>>> >
>>>> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
>>>> > >>>     at
>>>> >
>>>> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>>>> > >>>     at
>>>> >
>>>> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
>>>> > >>>     at
>>>> >
>>>> com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>>>> > >>>     at
>>>> >
>>>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>>>> > >>>     at
>>>> >
>>>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>>>> > >>>     at
>>>> >
>>>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
>>>> > >>>     at
>>>> >
>>>> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
>>>> > >>>     at
>>>> org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)
>>>> > >>>
>>>> > >>> I feel that I’m close to making it run. Thanks for the help in
>>>> advance.
>>>> > >>> On 11 Aug 2014, at 22:06, Telles Nobrega <tellesnobrega@gmail.com
>>>> >
>>>> > wrote:
>>>> > >>>
>>>> > >>>> Hi, I downloaded hadoop-common-2.3.0.jar and it worked better.
>>>> Now
>>>> > I’m having a configuration problem with my host, but it looks like
>>>> the hdfs
>>>> > is not a problem anymore.
>>>> > >>>>
>>>> > >>>>
>>>> > >>>>
>>>> > >>>>
>>>> > >>>> On 11 Aug 2014, at 22:04, Telles Nobrega <
>>>> tellesnobrega@gmail.com>
>>>> > wrote:
>>>> > >>>>
>>>> > >>>>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency.
>>>> Recompiled
>>>> > the project, extracted to deploy/samza and there problem still
>>>> happens. I
>>>> > downloaded hadoop-client-2.3.0.jar and the problems still happens,
>>>> > hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
>>>> > >>>>>
>>>> > >>>>> Actually a lot of hadoop jars are 2.2.0
>>>> > >>>>>
>>>> > >>>>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com>
>>>> wrote:
>>>> > >>>>>
>>>> > >>>>>> <include>org.apache.hadoop:hadoop-hdfs</include>
>>>> > >>>>>
>>>> > >>>>
>>>> > >>>
>>>> > >>
>>>> > >
>>>> >
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> ------------------------------------------
>>> Telles Mota Vidal Nobrega
>>> M.sc. Candidate at UFCG
>>> B.sc. in Computer Science at UFCG
>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>
>>
>>
>>
>> --
>> ------------------------------------------
>> Telles Mota Vidal Nobrega
>> M.sc. Candidate at UFCG
>> B.sc. in Computer Science at UFCG
>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>
>
>
>
> --
> ------------------------------------------
> Telles Mota Vidal Nobrega
> M.sc. Candidate at UFCG
> B.sc. in Computer Science at UFCG
> Software Engineer at OpenStack Project - HP/LSD-UFCG
>



-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Chris and Yan,

I was able to run the job but I got the error:

Exception in thread "main" java.util.ServiceConfigurationError:
org.apache.hadoop.fs.FileSystem: Provider
org.apache.hadoop.hdfs.DistributedFileSystem could not be instantiated
        at java.util.ServiceLoader.fail(ServiceLoader.java:224)
        at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
        at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
        at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
        at
org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2400)
        at
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
        at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
        at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
        at
org.apache.samza.job.yarn.SamzaAppMasterTaskManager.startContainer(SamzaAppMasterTaskManager.scala:278)
        at
org.apache.samza.job.yarn.SamzaAppMasterTaskManager.onContainerAllocated(SamzaAppMasterTaskManager.scala:126)
        at
org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8$$anonfun$apply$2.apply(YarnAppMaster.scala:66)
        at
org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8$$anonfun$apply$2.apply(YarnAppMaster.scala:66)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at
org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8.apply(YarnAppMaster.scala:66)
        at
org.apache.samza.job.yarn.YarnAppMaster$$anonfun$run$8.apply(YarnAppMaster.scala:66)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at
scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at
org.apache.samza.job.yarn.YarnAppMaster.run(YarnAppMaster.scala:66)
        at
org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:81)
        at
org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
Caused by: java.lang.NoClassDefFoundError:
org/apache/hadoop/conf/Configuration$DeprecationDelta
        at
org.apache.hadoop.hdfs.HdfsConfiguration.addDeprecatedKeys(HdfsConfiguration.java:66)
        at
org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:31)
        at
org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:106)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
        at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at java.lang.Class.newInstance(Class.java:374)
        at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
        ... 23 more
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.conf.Configuration$DeprecationDelta
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 32 more

This happens on the machine that is running the job. Do I need to put the jar
files there too? And if so, where?

Thanks
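
(A note on the trace above: NoClassDefFoundError on Configuration$DeprecationDelta
typically means hadoop-hdfs 2.3.0 is loaded next to an older hadoop-common, since
DeprecationDelta only exists from hadoop-common 2.3.0 onwards. A minimal sketch of
the fix in the job's pom.xml, with 2.3.0 taken from this thread rather than from
any official requirement, is to keep every Hadoop artifact on the same version so
the assembly bundles matching jars into the tar.gz:

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.3.0</version>
    </dependency>

The jars then travel inside the uploaded package; they do not need to be copied
to the other machines by hand.)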


On Tue, Aug 12, 2014 at 9:17 AM, Telles Nobrega <te...@gmail.com>
wrote:

> Sorry for bothering this much.
>
>
> On Tue, Aug 12, 2014 at 9:17 AM, Telles Nobrega <te...@gmail.com>
> wrote:
>
>> Now I have this error:
>>
>> Exception in thread "main" java.net.ConnectException: Call From
>> telles-samza-master/10.1.0.79 to telles-samza-master:8020 failed on
>> connection exception: java.net.ConnectException: Connection refused; For
>> more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>  at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>  at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
>>  at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>> at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>>  at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>  at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>>  at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>> at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>  at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>> at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
>>  at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
>>  at
>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
>> at
>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>  at
>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
>> at
>> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>>  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>  at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>> at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>>  at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>> Caused by: java.net.ConnectException: Connection refused
>>  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>  at
>> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>>  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>> at
>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:601)
>>  at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:696)
>> at org.apache.hadoop.ipc.Client$Connection.access$2700(Client.java:367)
>>  at org.apache.hadoop.ipc.Client.getConnection(Client.java:1458)
>> at org.apache.hadoop.ipc.Client.call(Client.java:1377)
>>  ... 22 more
>>
>>
>>
>> On Tue, Aug 12, 2014 at 3:39 AM, Yan Fang <ya...@gmail.com> wrote:
>>
>>> Hi Telles,
>>>
>>> I think you put the wrong port. Usually, the HDFS port is 8020, not
>>> 50070.
>>> You should put something like:
>>> *hdfs://telles**-samza-master:8020*/path/to/samza-job-package.taz.gz.
>>> Thanks.
>>>
>>> Fang, Yan
>>> yanfang724@gmail.com
>>> +1 (206) 849-4108
>>>
>>>
>>> On Mon, Aug 11, 2014 at 8:31 PM, Telles Nobrega <tellesnobrega@gmail.com
>>> >
>>> wrote:
>>>
>>> > I tried moving from HDFS to HttpFileSystem. I’m getting the
>>> HttpFileSystem
>>> > not found exception. I have done the steps in the tutorial that Chris
>>> > pasted below (I had done that before, but I’m not sure what is the
>>> > problem). Seems like since I have the compiled file in one machine
>>> > (resource manager) and I submit it and try to download from the node
>>> > managers, they don’t have samza-yarn.jar (don’t know how to include it,
>>> > since the run will be done in the resource manager).
>>> >
>>> > Can you give me a tip on how to solve this?
>>> >
>>> > Thanks in advance.
>>> >
>>> > ps. the folder and tar.gz of the job are located in one machine alone,
>>> is
>>> > that the right way to do it or do I need to replicate hello-samza in
>>> all
>>> > machines to run it?
>>> > On 11 Aug 2014, at 23:12, Telles Nobrega <te...@gmail.com>
>>> wrote:
>>> >
>>> > > What is your suggestion here, should I keep going on this quest to
>>> fix
>>> > hdfs or should I try to run using HttpFileSystem?
>>> > > On 11 Aug 2014, at 23:01, Telles Nobrega <te...@gmail.com>
>>> > wrote:
>>> > >
>>> > >> The port is right?? 50700. I have no idea what is happening now.
>>> > >>
>>> > >> On 11 Aug 2014, at 22:33, Telles Nobrega <te...@gmail.com>
>>> > wrote:
>>> > >>
>>> > >>> Right now the error is the following:
>>> > >>> Exception in thread "main" java.io.IOException: Failed on local
>>> > exception: com.google.protobuf.InvalidProtocolBufferException: Protocol
>>> > message end-group tag did not match expected tag.; Host Details : local
>>> > host is: "telles-samza-master/10.1.0.79"; destination host is:
>>> > "telles-samza-master":50070;
>>> > >>>     at
>>> org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>>> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>>> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>>> > >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> > >>>     at
>>> >
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> > >>>     at
>>> >
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> > >>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
>>> > >>>     at
>>> > org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
>>> > >>>     at
>>> >
>>> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>>> > >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>> > >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>> > >>>     at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>>> > >>>     at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>>> > >>>     at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>>> > >>> Caused by: com.google.protobuf.InvalidProtocolBufferException:
>>> > Protocol message end-group tag did not match expected tag.
>>> > >>>     at
>>> >
>>> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
>>> > >>>     at
>>> >
>>> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>>> > >>>     at
>>> >
>>> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
>>> > >>>     at
>>> >
>>> com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>>> > >>>     at
>>> >
>>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>>> > >>>     at
>>> >
>>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>>> > >>>     at
>>> >
>>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
>>> > >>>     at
>>> >
>>> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
>>> > >>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)
>>> > >>>
>>> > >>> I feel that I’m close to making it run. Thanks for the help in
>>> advance.
>>> > >>> On 11 Aug 2014, at 22:06, Telles Nobrega <te...@gmail.com>
>>> > wrote:
>>> > >>>
>>> > >>>> Hi, I downloaded hadoop-common-2.3.0.jar and it worked better. Now
>>> > I’m having a configuration problem with my host, but it looks like the
>>> hdfs
>>> > is not a problem anymore.
>>> > >>>>
>>> > >>>>
>>> > >>>>
>>> > >>>>
>>> > >>>> On 11 Aug 2014, at 22:04, Telles Nobrega <tellesnobrega@gmail.com
>>> >
>>> > wrote:
>>> > >>>>
>>> > >>>>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency.
>>> Recompiled
>>> > the project, extracted to deploy/samza and there problem still
>>> happens. I
>>> > downloaded hadoop-client-2.3.0.jar and the problems still happens,
>>> > hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
>>> > >>>>>
>>> > >>>>> Actually a lot of hadoop jars are 2.2.0
>>> > >>>>>
>>> > >>>>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:
>>> > >>>>>
>>> > >>>>>> <include>org.apache.hadoop:hadoop-hdfs</include>
>>> > >>>>>
>>> > >>>>
>>> > >>>
>>> > >>
>>> > >
>>> >
>>> >
>>>
>>
>>
>>
>> --
>> ------------------------------------------
>> Telles Mota Vidal Nobrega
>> M.sc. Candidate at UFCG
>> B.sc. in Computer Science at UFCG
>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>
>
>
>
> --
> ------------------------------------------
> Telles Mota Vidal Nobrega
> M.sc. Candidate at UFCG
> B.sc. in Computer Science at UFCG
> Software Engineer at OpenStack Project - HP/LSD-UFCG
>



-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Sorry for bothering you this much.


On Tue, Aug 12, 2014 at 9:17 AM, Telles Nobrega <te...@gmail.com>
wrote:

> Now I have this error:
>
> Exception in thread "main" java.net.ConnectException: Call From
> telles-samza-master/10.1.0.79 to telles-samza-master:8020 failed on
> connection exception: java.net.ConnectException: Connection refused; For
> more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>  at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>  at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1410)
> at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>  at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>  at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
>  at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>  at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
>  at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
>  at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
> at
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>  at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>  at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> Caused by: java.net.ConnectException: Connection refused
>  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>  at
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:601)
>  at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:696)
> at org.apache.hadoop.ipc.Client$Connection.access$2700(Client.java:367)
>  at org.apache.hadoop.ipc.Client.getConnection(Client.java:1458)
> at org.apache.hadoop.ipc.Client.call(Client.java:1377)
>  ... 22 more
>
>
>
> On Tue, Aug 12, 2014 at 3:39 AM, Yan Fang <ya...@gmail.com> wrote:
>
>> Hi Telles,
>>
>> I think you put the wrong port. Usually, the HDFS port is 8020, not 50070.
>> You should put something like:
>> *hdfs://telles**-samza-master:8020*/path/to/samza-job-package.taz.gz.
>> Thanks.
>>
>> Fang, Yan
>> yanfang724@gmail.com
>> +1 (206) 849-4108
>>
>>
>> On Mon, Aug 11, 2014 at 8:31 PM, Telles Nobrega <te...@gmail.com>
>> wrote:
>>
>> > I tried moving from HDFS to HttpFileSystem. I’m getting the
>> HttpFileSystem
>> > not found exception. I have done the steps in the tutorial that Chris
>> > pasted below (I had done that before, but I’m not sure what is the
>> > problem). Seems like since I have the compiled file in one machine
>> > (resource manager) and I submit it and try to download from the node
>> > managers, they don’t have samza-yarn.jar (don’t know how to include it,
>> > since the run will be done in the resource manager).
>> >
>> > Can you give me a tip on how to solve this?
>> >
>> > Thanks in advance.
>> >
>> > ps. the folder and tar.gz of the job are located in one machine alone,
>> is
>> > that the right way to do it or do I need to replicate hello-samza in all
>> > machines to run it?
>> > On 11 Aug 2014, at 23:12, Telles Nobrega <te...@gmail.com>
>> wrote:
>> >
>> > > What is your suggestion here, should I keep going on this quest to fix
>> > hdfs or should I try to run using HttpFileSystem?
>> > > On 11 Aug 2014, at 23:01, Telles Nobrega <te...@gmail.com>
>> > wrote:
>> > >
>> > >> The port is right?? 50700. I have no idea what is happening now.
>> > >>
>> > >> On 11 Aug 2014, at 22:33, Telles Nobrega <te...@gmail.com>
>> > wrote:
>> > >>
>> > >>> Right now the error is the following:
>> > >>> Exception in thread "main" java.io.IOException: Failed on local
>> > exception: com.google.protobuf.InvalidProtocolBufferException: Protocol
>> > message end-group tag did not match expected tag.; Host Details : local
>> > host is: "telles-samza-master/10.1.0.79"; destination host is:
>> > "telles-samza-master":50070;
>> > >>>     at
>> org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>> > >>>     at
>> >
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>> > >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > >>>     at
>> >
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> > >>>     at
>> >
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > >>>     at java.lang.reflect.Method.invoke(Method.java:606)
>> > >>>     at
>> >
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>> > >>>     at
>> >
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>> > >>>     at
>> >
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
>> > >>>     at
>> > org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
>> > >>>     at
>> >
>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
>> > >>>     at
>> >
>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
>> > >>>     at
>> >
>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>> > >>>     at
>> >
>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
>> > >>>     at
>> >
>> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>> > >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>> > >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>> > >>>     at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>> > >>>     at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>> > >>>     at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>> > >>> Caused by: com.google.protobuf.InvalidProtocolBufferException:
>> > Protocol message end-group tag did not match expected tag.
>> > >>>     at
>> >
>> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
>> > >>>     at
>> >
>> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>> > >>>     at
>> >
>> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
>> > >>>     at
>> >
>> com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>> > >>>     at
>> >
>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>> > >>>     at
>> >
>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>> > >>>     at
>> >
>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>> > >>>     at
>> >
>> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
>> > >>>     at
>> >
>> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
>> > >>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)
>> > >>>
>> > >>> I feel that I’m close to making it run. Thanks for the help in
>> advance.
>> > >>> On 11 Aug 2014, at 22:06, Telles Nobrega <te...@gmail.com>
>> > wrote:
>> > >>>
>> > >>>> Hi, I downloaded hadoop-common-2.3.0.jar and it worked better. Now
>> > I’m having a configuration problem with my host, but it looks like the
>> hdfs
>> > is not a problem anymore.
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> On 11 Aug 2014, at 22:04, Telles Nobrega <te...@gmail.com>
>> > wrote:
>> > >>>>
>> > >>>>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency.
>> Recompiled
>> > the project, extracted to deploy/samza and there problem still happens.
>> I
>> > downloaded hadoop-client-2.3.0.jar and the problems still happens,
>> > hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
>> > >>>>>
>> > >>>>> Actually a lot of hadoop jars are 2.2.0
>> > >>>>>
>> > >>>>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:
>> > >>>>>
>> > >>>>>> <include>org.apache.hadoop:hadoop-hdfs</include>
>> > >>>>>
>> > >>>>
>> > >>>
>> > >>
>> > >
>> >
>> >
>>
>
>
>
> --
> ------------------------------------------
> Telles Mota Vidal Nobrega
> M.sc. Candidate at UFCG
> B.sc. in Computer Science at UFCG
> Software Engineer at OpenStack Project - HP/LSD-UFCG
>



-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Now I have this error:

Exception in thread "main" java.net.ConnectException: Call From
telles-samza-master/10.1.0.79 to telles-samza-master:8020 failed on
connection exception: java.net.ConnectException: Connection refused; For
more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
 at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
 at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
 at org.apache.hadoop.ipc.Client.call(Client.java:1410)
at org.apache.hadoop.ipc.Client.call(Client.java:1359)
 at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
 at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
 at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
at
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
 at
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
at
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
at
org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
 at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
 at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
 at org.apache.samza.job.JobRunner.main(JobRunner.scala)
Caused by: java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
 at
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:601)
 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:696)
at org.apache.hadoop.ipc.Client$Connection.access$2700(Client.java:367)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1458)
at org.apache.hadoop.ipc.Client.call(Client.java:1377)
 ... 22 more
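
(The Connection refused here just means nothing is answering on
telles-samza-master:8020. A quick way to see what address the NameNode actually
listens on, assuming a standard Hadoop 2.x install, is:

    hdfs getconf -confKey fs.defaultFS
    jps    # confirm a NameNode process is actually running

Whatever host:port fs.defaultFS reports is what the hdfs:// URL in
yarn.package.path has to match.)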



On Tue, Aug 12, 2014 at 3:39 AM, Yan Fang <ya...@gmail.com> wrote:

> Hi Telles,
>
> I think you put the wrong port. Usually, the HDFS port is 8020, not 50070.
> You should put something like:
> *hdfs://telles**-samza-master:8020*/path/to/samza-job-package.taz.gz.
> Thanks.
>
> Fang, Yan
> yanfang724@gmail.com
> +1 (206) 849-4108
>
>
> On Mon, Aug 11, 2014 at 8:31 PM, Telles Nobrega <te...@gmail.com>
> wrote:
>
> > I tried moving from HDFS to HttpFileSystem. I’m getting the
> HttpFileSystem
> > not found exception. I have done the steps in the tutorial that Chris
> > pasted below (I had done that before, but I’m not sure what is the
> > problem). Seems like since I have the compiled file in one machine
> > (resource manager) and I submit it and try to download from the node
> > managers, they don’t have samza-yarn.jar (don’t know how to include it,
> > since the run will be done in the resource manager).
> >
> > Can you give me a tip on how to solve this?
> >
> > Thanks in advance.
> >
> > ps. the folder and tar.gz of the job are located in one machine alone, is
> > that the right way to do it or do I need to replicate hello-samza in all
> > machines to run it?
> > On 11 Aug 2014, at 23:12, Telles Nobrega <te...@gmail.com>
> wrote:
> >
> > > What is your suggestion here, should I keep going on this quest to fix
> > hdfs or should I try to run using HttpFileSystem?
> > > On 11 Aug 2014, at 23:01, Telles Nobrega <te...@gmail.com>
> > wrote:
> > >
> > >> The port is right?? 50700. I have no idea what is happening now.
> > >>
> > >> On 11 Aug 2014, at 22:33, Telles Nobrega <te...@gmail.com>
> > wrote:
> > >>
> > >>> Right now the error is the following:
> > >>> Exception in thread "main" java.io.IOException: Failed on local
> > exception: com.google.protobuf.InvalidProtocolBufferException: Protocol
> > message end-group tag did not match expected tag.; Host Details : local
> > host is: "telles-samza-master/10.1.0.79"; destination host is:
> > "telles-samza-master":50070;
> > >>>     at
> org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1410)
> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1359)
> > >>>     at
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> > >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >>>     at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > >>>     at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >>>     at java.lang.reflect.Method.invoke(Method.java:606)
> > >>>     at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> > >>>     at
> >
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> > >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> > >>>     at
> >
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
> > >>>     at
> > org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
> > >>>     at
> >
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
> > >>>     at
> >
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
> > >>>     at
> >
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> > >>>     at
> >
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
> > >>>     at
> >
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> > >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> > >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> > >>>     at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> > >>>     at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> > >>>     at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> > >>> Caused by: com.google.protobuf.InvalidProtocolBufferException:
> > Protocol message end-group tag did not match expected tag.
> > >>>     at
> >
> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
> > >>>     at
> >
> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
> > >>>     at
> >
> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
> > >>>     at
> >
> com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
> > >>>     at
> >
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
> > >>>     at
> >
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
> > >>>     at
> >
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
> > >>>     at
> >
> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
> > >>>     at
> >
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
> > >>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)
> > >>>
> > >>> I feel that I’m close to making it run. Thanks for the help in
> advance.
> > >>> On 11 Aug 2014, at 22:06, Telles Nobrega <te...@gmail.com>
> > wrote:
> > >>>
> > >>>> Hi, I downloaded hadoop-common-2.3.0.jar and it worked better. Now
> > I’m having a configuration problem with my host, but it looks like the
> hdfs
> > is not a problem anymore.
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 11 Aug 2014, at 22:04, Telles Nobrega <te...@gmail.com>
> > wrote:
> > >>>>
> > >>>>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency. Recompiled
> > the project, extracted to deploy/samza and there problem still happens. I
> > downloaded hadoop-client-2.3.0.jar and the problems still happens,
> > hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
> > >>>>>
> > >>>>> Actually a lot of hadoop jars are 2.2.0
> > >>>>>
> > >>>>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:
> > >>>>>
> > >>>>>> <include>org.apache.hadoop:hadoop-hdfs</include>
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
> >
>



-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Yan Fang <ya...@gmail.com>.
Hi Telles,

I think you put the wrong port. Usually, the HDFS port is 8020, not 50070.
You should put something like:
hdfs://telles-samza-master:8020/path/to/samza-job-package.tar.gz
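
Concretely, the relevant pieces would look something like this (the /samza
directory is just an example, and 8020 assumes the NameNode is on its default
RPC port):

    # upload the package from the machine where it was built
    hadoop fs -mkdir /samza
    hadoop fs -put samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz /samza/

    # in the job's .properties file
    yarn.package.path=hdfs://telles-samza-master:8020/samza/samza-job-package-0.7.0-dist.tar.gz
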
Thanks.

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108


On Mon, Aug 11, 2014 at 8:31 PM, Telles Nobrega <te...@gmail.com>
wrote:

> I tried moving from HDFS to HttpFileSystem. I’m getting the HttpFileSystem
> not found exception. I have done the steps in the tutorial that Chris
> pasted below (I had done that before, but I’m not sure what is the
> problem). Seems like since I have the compiled file in one machine
> (resource manager) and I submit it and try to download from the node
> managers, they don’t have samza-yarn.jar (don’t know how to include it,
> since the run will be done in the resource manager).
>
> Can you give me a tip on how to solve this?
>
> Thanks in advance.
>
> ps. the folder and tar.gz of the job are located in one machine alone, is
> that the right way to do it or do I need to replicate hello-samza in all
> machines to run it?
> On 11 Aug 2014, at 23:12, Telles Nobrega <te...@gmail.com> wrote:
>
> > What is your suggestion here, should I keep going on this quest to fix
> hdfs or should I try to run using HttpFileSystem?
> > On 11 Aug 2014, at 23:01, Telles Nobrega <te...@gmail.com>
> wrote:
> >
> >> The port is right?? 50700. I have no idea what is happening now.
> >>
> >> On 11 Aug 2014, at 22:33, Telles Nobrega <te...@gmail.com>
> wrote:
> >>
> >>> Right now the error is the following:
> >>> Exception in thread "main" java.io.IOException: Failed on local
> exception: com.google.protobuf.InvalidProtocolBufferException: Protocol
> message end-group tag did not match expected tag.; Host Details : local
> host is: "telles-samza-master/10.1.0.79"; destination host is:
> "telles-samza-master":50070;
> >>>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1410)
> >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1359)
> >>>     at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>     at java.lang.reflect.Method.invoke(Method.java:606)
> >>>     at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> >>>     at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> >>>     at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> >>>     at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
> >>>     at
> org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
> >>>     at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
> >>>     at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
> >>>     at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> >>>     at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
> >>>     at
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >>>     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >>>     at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >>>     at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> >>>     at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >>> Caused by: com.google.protobuf.InvalidProtocolBufferException:
> Protocol message end-group tag did not match expected tag.
> >>>     at
> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
> >>>     at
> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
> >>>     at
> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
> >>>     at
> com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
> >>>     at
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
> >>>     at
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
> >>>     at
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
> >>>     at
> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
> >>>     at
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
> >>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)
> >>>
> >>> I feel that I’m close to making it run. Thanks for the help in advance.
> >>> On 11 Aug 2014, at 22:06, Telles Nobrega <te...@gmail.com>
> wrote:
> >>>
> >>>> Hi, I downloaded hadoop-common-2.3.0.jar and it worked better. Now
> I’m having a configuration problem with my host, but it looks like the hdfs
> is not a problem anymore.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On 11 Aug 2014, at 22:04, Telles Nobrega <te...@gmail.com>
> wrote:
> >>>>
> >>>>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency. Recompiled
> the project, extracted to deploy/samza and there problem still happens. I
> downloaded hadoop-client-2.3.0.jar and the problems still happens,
> hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
> >>>>>
> >>>>> Actually a lot of hadoop jars are 2.2.0
> >>>>>
> >>>>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:
> >>>>>
> >>>>>> <include>org.apache.hadoop:hadoop-hdfs</include>
> >>>>>
> >>>>
> >>>
> >>
> >
>
>

Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
I tried moving from HDFS to HttpFileSystem, and now I’m getting the HttpFileSystem-not-found exception. I have done the steps in the tutorial that Chris pasted below (I had done that before, but I’m not sure what the problem is). It seems that since the compiled package lives on a single machine (the resource manager) and I submit from there, the node managers that try to download it don’t have samza-yarn.jar on their classpath (and I don’t know how to include it, since the job is submitted from the resource manager).

Can you give me a tip on how to solve this?

Thanks in advance.

p.s.: the job folder and tar.gz are located on one machine only; is that the right way to do it, or do I need to replicate hello-samza on every machine to run it?
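
For reference, the tutorial setup I followed boils down to the core-site.xml property below on each node manager, plus putting the samza-yarn and scala jars somewhere on YARN’s classpath (which is the part I’m not sure I got right). The property name is as I understand it from the tutorial; the value matches the class named in the exception:

  <property>
    <name>fs.http.impl</name>
    <value>org.apache.samza.util.hadoop.HttpFileSystem</value>
    <description>The FileSystem for http: uris.</description>
  </property>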
On 11 Aug 2014, at 23:12, Telles Nobrega <te...@gmail.com> wrote:

> What is your suggestion here, should I keep going on this quest to fix hdfs or should I try to run using HttpFileSystem?
> On 11 Aug 2014, at 23:01, Telles Nobrega <te...@gmail.com> wrote:
> 
>> The port is right?? 50070. I have no idea what is happening now.
>> 
>> On 11 Aug 2014, at 22:33, Telles Nobrega <te...@gmail.com> wrote:
>> 
>>> Right now the error is the following:
>>> Exception in thread "main" java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "telles-samza-master/10.1.0.79"; destination host is: "telles-samza-master":50070;
>>> 	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>>> 	at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>>> 	at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>>> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>> 	at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>>> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>> 	at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>>> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
>>> 	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
>>> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
>>> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
>>> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
>>> 	at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>>> 	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>> 	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>> 	at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>>> 	at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>>> 	at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>>> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
>>> 	at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
>>> 	at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>>> 	at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
>>> 	at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>>> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>>> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>>> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>>> 	at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
>>> 	at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
>>> 	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)
>>> 
>>> I feel that I’m close to making it run. Thanks for the help in advance.
>>> On 11 Aug 2014, at 22:06, Telles Nobrega <te...@gmail.com> wrote:
>>> 
>>>> Hi, I downloaded hadoop-common-2.3.0.jar and it worked better. Now I’m having a configuration problem with my host, but it looks like the hdfs is not a problem anymore.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 11 Aug 2014, at 22:04, Telles Nobrega <te...@gmail.com> wrote:
>>>> 
>>>>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency. Recompiled the project, extracted to deploy/samza and there problem still happens. I downloaded hadoop-client-2.3.0.jar and the problems still happens, hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
>>>>> 
>>>>> Actually a lot of hadoop jars are 2.2.0
>>>>> 
>>>>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:
>>>>> 
>>>>>> <include>org.apache.hadoop:hadoop-hdfs</include>
>>>>> 
>>>> 
>>> 
>> 
> 


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
What is your suggestion here, should I keep going on this quest to fix hdfs or should I try to run using HttpFileSystem?
On 11 Aug 2014, at 23:01, Telles Nobrega <te...@gmail.com> wrote:

> The port is right?? 50070. I have no idea what is happening now.
> 
> On 11 Aug 2014, at 22:33, Telles Nobrega <te...@gmail.com> wrote:
> 
>> Right now the error is the following:
>> Exception in thread "main" java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "telles-samza-master/10.1.0.79"; destination host is: "telles-samza-master":50070;
>> 	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>> 	at org.apache.hadoop.ipc.Client.call(Client.java:1410)
>> 	at org.apache.hadoop.ipc.Client.call(Client.java:1359)
>> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> 	at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>> 	at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
>> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
>> 	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
>> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
>> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
>> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
>> 	at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>> 	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>> 	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>> 	at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>> 	at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>> 	at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
>> 	at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
>> 	at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>> 	at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
>> 	at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>> 	at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
>> 	at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
>> 	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)
>> 
>> I feel that I’m close to making it run. Thanks for the help in advance.
>> On 11 Aug 2014, at 22:06, Telles Nobrega <te...@gmail.com> wrote:
>> 
>>> Hi, I downloaded hadoop-common-2.3.0.jar and it worked better. Now I’m having a configuration problem with my host, but it looks like the hdfs is not a problem anymore.
>>> 
>>> 
>>> 
>>> 
>>> On 11 Aug 2014, at 22:04, Telles Nobrega <te...@gmail.com> wrote:
>>> 
>>>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency. Recompiled the project, extracted to deploy/samza and there problem still happens. I downloaded hadoop-client-2.3.0.jar and the problems still happens, hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
>>>> 
>>>> Actually a lot of hadoop jars are 2.2.0
>>>> 
>>>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:
>>>> 
>>>>> <include>org.apache.hadoop:hadoop-hdfs</include>
>>>> 
>>> 
>> 
> 


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
The port is right?? 50070. I have no idea what is happening now.

On 11 Aug 2014, at 22:33, Telles Nobrega <te...@gmail.com> wrote:

> Right now the error is the following:
> Exception in thread "main" java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "telles-samza-master/10.1.0.79"; destination host is: "telles-samza-master":50070;
> 	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1410)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1359)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> 	at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> 	at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
> 	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
> 	at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> 	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> 	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> 	at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> 	at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> 	at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
> 	at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
> 	at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
> 	at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
> 	at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
> 	at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
> 	at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
> 	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)
> 
> I feel that I’m close to making it run. Thanks for the help in advance.
> On 11 Aug 2014, at 22:06, Telles Nobrega <te...@gmail.com> wrote:
> 
>> Hi, I downloaded hadoop-common-2.3.0.jar and it worked better. Now I’m having a configuration problem with my host, but it looks like the hdfs is not a problem anymore.
>> 
>> 
>> 
>> 
>> On 11 Aug 2014, at 22:04, Telles Nobrega <te...@gmail.com> wrote:
>> 
>>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency. Recompiled the project, extracted to deploy/samza and there problem still happens. I downloaded hadoop-client-2.3.0.jar and the problems still happens, hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
>>> 
>>> Actually a lot of hadoop jars are 2.2.0
>>> 
>>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:
>>> 
>>>> <include>org.apache.hadoop:hadoop-hdfs</include>
>>> 
>> 
> 


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Right now the error is the following:
Exception in thread "main" java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "telles-samza-master/10.1.0.79"; destination host is: "telles-samza-master":50070;
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
	at org.apache.hadoop.ipc.Client.call(Client.java:1410)
	at org.apache.hadoop.ipc.Client.call(Client.java:1359)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:671)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1746)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1112)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1108)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1108)
	at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
	at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
	at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
	at org.apache.samza.job.JobRunner.main(JobRunner.scala)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
	at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
	at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
	at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
	at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
	at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
	at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1051)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:945)

I feel that I’m close to making it run. Thanks for the help in advance.
On 11 Aug 2014, at 22:06, Telles Nobrega <te...@gmail.com> wrote:

> Hi, I downloaded hadoop-common-2.3.0.jar and it worked better. Now I’m having a configuration problem with my host, but it looks like the hdfs is not a problem anymore.
> 
> 
> 
> 
> On 11 Aug 2014, at 22:04, Telles Nobrega <te...@gmail.com> wrote:
> 
>> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency. Recompiled the project, extracted to deploy/samza and there problem still happens. I downloaded hadoop-client-2.3.0.jar and the problems still happens, hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
>> 
>> Actually a lot of hadoop jars are 2.2.0
>> 
>> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:
>> 
>>> <include>org.apache.hadoop:hadoop-hdfs</include>
>> 
> 


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Hi, I downloaded hadoop-common-2.3.0.jar and it worked better. Now I’m having a configuration problem with my host, but it looks like HDFS is not the problem anymore.




On 11 Aug 2014, at 22:04, Telles Nobrega <te...@gmail.com> wrote:

> So, I added hadoop-hdfs-2.3.0.jar as a maven dependency. Recompiled the project, extracted to deploy/samza and there problem still happens. I downloaded hadoop-client-2.3.0.jar and the problems still happens, hadoop-common is 2.2.0 does this is a problem? I will try with 2.3.0
> 
> Actually a lot of hadoop jars are 2.2.0
> 
> On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:
> 
>> <include>org.apache.hadoop:hadoop-hdfs</include>
> 


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
So, I added hadoop-hdfs-2.3.0.jar as a Maven dependency, recompiled the project, and extracted it to deploy/samza, and the problem still happens. I also downloaded hadoop-client-2.3.0.jar and the problem still happens. hadoop-common is 2.2.0; is that a problem? I will try with 2.3.0.

Actually a lot of hadoop jars are 2.2.0
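
In case it helps, this is roughly how I’m checking which Hadoop jar versions ended up in the extracted lib directory (the deploy/samza path is assumed from the hello-samza layout mentioned earlier in the thread):

  ls deploy/samza/lib | grep hadoop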

On 11 Aug 2014, at 21:33, Yan Fang <ya...@gmail.com> wrote:

> <include>org.apache.hadoop:hadoop-hdfs</include>


Re: Running Job on Multinode Yarn Cluster

Posted by Yan Fang <ya...@gmail.com>.
Oops, it seems that hadoop-hdfs-2.3.0.jar is missing some of its dependencies. It's
a Hadoop-specific exception.

* Could you add the jar file that contains the missing class? (not quite
sure it's in hadoop-common or hadoop-client)

* Another thing worth trying is to add the dependency in the pom file and let Maven
take care of it. Assuming you are using a structure similar to hello-samza, there
are two steps (sketched below):
  ** add dependency in pom.xml of samza-job-package
  ** add <include>org.apache.hadoop:hadoop-hdfs</include> in
samza-job-package/src/main/assembly/src.xml
Then mvn clean package.
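
Roughly, the two additions would look like this (the 2.3.0 version is an assumption, chosen to match your cluster's Hadoop; the <include> goes inside the existing dependencySet includes):

  <!-- samza-job-package/pom.xml -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.3.0</version>
  </dependency>

  <!-- samza-job-package/src/main/assembly/src.xml -->
  <include>org.apache.hadoop:hadoop-hdfs</include>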

Thank you.

Cheers,

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108


On Mon, Aug 11, 2014 at 5:12 PM, Telles Nobrega <te...@gmail.com>
wrote:

> Same thing happens.
>
> On 11 Aug 2014, at 21:05, Yan Fang <ya...@gmail.com> wrote:
>
> > Cool, we are almost there. Could you remove
> >
> > <property>
> >  <name>fs.hdfs.impl</name>
> >  <value>org.apache.hadoop.hdfs.
> > DistributedFileSystem</value>
> >  <description>The FileSystem for hdfs: uris.</description>
> > </property>
> >
> > To see how it works?
> >
> >
> > Fang, Yan
> > yanfang724@gmail.com
> > +1 (206) 849-4108
> >
> >
> > On Mon, Aug 11, 2014 at 5:03 PM, Telles Nobrega <tellesnobrega@gmail.com
> >
> > wrote:
> >
> >> You may forget this last email, I was really stupid and put the files
> in a
> >> different folder. Now it could find the file but it’s not there yet…
> >> another error came up
> >>
> >> Exception in thread "main" java.util.ServiceConfigurationError:
> >> org.apache.hadoop.fs.FileSystem: Provider
> >> org.apache.hadoop.hdfs.DistributedFileSystem could not be instantiated
> >>        at java.util.ServiceLoader.fail(ServiceLoader.java:224)
> >>        at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
> >>        at
> >> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
> >>        at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
> >>        at
> >> org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2400)
> >>        at
> >> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
> >>        at
> >> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> >>        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >>        at
> >> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> >>        at
> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> >>        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >>        at
> >>
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> >>        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >>        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >>        at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >>        at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> >>        at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >> Caused by: java.lang.NoClassDefFoundError:
> >> org/apache/hadoop/conf/Configuration$DeprecationDelta
> >>        at
> >>
> org.apache.hadoop.hdfs.HdfsConfiguration.addDeprecatedKeys(HdfsConfiguration.java:66)
> >>        at
> >>
> org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:31)
> >>        at
> >>
> org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:106)
> >>        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> >> Method)
> >>        at
> >>
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> >>        at
> >>
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >>        at
> java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> >>        at java.lang.Class.newInstance(Class.java:374)
> >>        at
> >> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
> >>        ... 15 more
> >> Caused by: java.lang.ClassNotFoundException:
> >> org.apache.hadoop.conf.Configuration$DeprecationDelta
> >>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> >>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> >>        at java.security.AccessController.doPrivileged(Native Method)
> >>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> >>        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> >>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> >>        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> >>        ... 24 more
> >>
> >> On 11 Aug 2014, at 20:45, Telles Nobrega <te...@gmail.com>
> wrote:
> >>
> >>> Hi, I copied hadoop-hdfs-2.3.0 to my-job/lib and it changed the error
> >> which is good but the error is back to
> >>>
> >>> Exception in thread "main" java.lang.RuntimeException:
> >> java.lang.ClassNotFoundException: Class
> >> org.apache.hadoop.hdfs.DistributedFileSystem not found
> >>>      at
> >> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1720)
> >>>      at
> >> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2415)
> >>>      at
> >> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> >>>      at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >>>      at
> >> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> >>>      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >>>      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> >>>      at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >>>      at
> >>
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> >>>      at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >>>      at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >>>      at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >>>      at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> >>>      at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >>> Caused by: java.lang.ClassNotFoundException: Class
> >> org.apache.hadoop.hdfs.DistributedFileSystem not found
> >>>      at
> >>
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
> >>>      at
> >> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1718)
> >>>      ... 13 more
> >>>
> >>> Do I need to have this lib in all nodes at the job folder or just to
> >> submit?
> >>>
> >>> On 11 Aug 2014, at 20:11, Yan Fang <ya...@gmail.com> wrote:
> >>>
> >>>> Hi Telles,
> >>>>
> >>>> I replayed your problem and think I figured out why CLASSPATH does not
> >>>> work. Because in our script bin/run-class.sh, we have the line
> >>>> "CLASSPATH=$HADOOP_CONF_DIR", which actually ingores your setting.
> >>>>
> >>>> So a simple solution is to copy the hadoop-hdfs.jar to your samza lib
> >>>> directory. Then run bin/run-job ----config-factory=...
> >> --config-path=... .
> >>>> Let me know how it goes. Thank you.
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Fang, Yan
> >>>> yanfang724@gmail.com
> >>>> +1 (206) 849-4108
> >>>>
> >>>>
> >>>> On Mon, Aug 11, 2014 at 4:07 PM, Telles Nobrega <
> >> tellesnobrega@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Sure, thanks.
> >>>>>
> >>>>>
> >>>>> On Mon, Aug 11, 2014 at 6:22 PM, Yan Fang <ya...@gmail.com>
> >> wrote:
> >>>>>
> >>>>>> Hi Telles,
> >>>>>>
> >>>>>> I am not sure whether exporting the CLASSPATH works. (sometimes it
> >> does
> >>>>> not
> >>>>>> work for me...) My suggestion is to include the hdfs jar explicitly
> in
> >>>>> the
> >>>>>> package that you upload to hdfs. Also , remember to put the jar into
> >> your
> >>>>>> local samza (which is deploy/samza/lib if you go with the
> hello-samza
> >>>>>> tutorial) Let me know if that works.
> >>>>>>
> >>>>>> Cheers,
> >>>>>>
> >>>>>> Fang, Yan
> >>>>>> yanfang724@gmail.com
> >>>>>> +1 (206) 849-4108
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Aug 11, 2014 at 2:04 PM, Chris Riccomini <
> >>>>>> criccomini@linkedin.com.invalid> wrote:
> >>>>>>
> >>>>>>> Hey Telles,
> >>>>>>>
> >>>>>>> Hmm. I'm out of ideas. If Zhijie is around, he'd probably be of
> use,
> >>>>> but
> >>>>>> I
> >>>>>>> haven't heard from him in a while.
> >>>>>>>
> >>>>>>> I'm afraid your best bet is probably to email the YARN dev mailing
> >>>>> list,
> >>>>>>> since this is a YARN config issue.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Chris
> >>>>>>>
> >>>>>>> On 8/11/14 1:58 PM, "Telles Nobrega" <te...@gmail.com>
> >> wrote:
> >>>>>>>
> >>>>>>>> ​I exported ​export
> >>>>>>>
> >>>>>>
> >>>>>>
> >>
> CLASSPATH=$CLASSPATH:hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar
> >>>>>>>> and still happened the same problem.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Mon, Aug 11, 2014 at 5:35 PM, Chris Riccomini <
> >>>>>>>> criccomini@linkedin.com.invalid> wrote:
> >>>>>>>>
> >>>>>>>>> Hey Telles,
> >>>>>>>>>
> >>>>>>>>> It sounds like either the HDFS jar is missing from the classpath,
> >> or
> >>>>>> the
> >>>>>>>>> hdfs file system needs to be configured:
> >>>>>>>>>
> >>>>>>>>> <property>
> >>>>>>>>> <name>fs.hdfs.impl</name>
> >>>>>>>>> <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
> >>>>>>>>> <description>The FileSystem for hdfs: uris.</description>
> >>>>>>>>> </property>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> (from
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>
> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptA
> >>>>>>>>> zE
> >>>>>>>>> 0)
> >>>>>>>>>
> >>>>>>>>> I believe this will need to be configured for your NM.
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>> Chris
> >>>>>>>>>
> >>>>>>>>> On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com>
> >>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Yes, it is like this:
> >>>>>>>>>>
> >>>>>>>>>> <configuration>
> >>>>>>>>>> <property>
> >>>>>>>>>> <name>dfs.datanode.data.dir</name>
> >>>>>>>>>> <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
> >>>>>>>>>> <description>Comma separated list of paths on the local
> >>>>>> filesystem
> >>>>>>>>> of
> >>>>>>>>>> a
> >>>>>>>>>> DataNode where it should store its blocks.</description>
> >>>>>>>>>> </property>
> >>>>>>>>>>
> >>>>>>>>>> <property>
> >>>>>>>>>> <name>dfs.namenode.name.dir</name>
> >>>>>>>>>> <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
> >>>>>>>>>> <description>Path on the local filesystem where the NameNode
> >>>>>> stores
> >>>>>>>>>> the
> >>>>>>>>>> namespace and transaction logs persistently.</description>
> >>>>>>>>>> </property>
> >>>>>>>>>> </configuration>
> >>>>>>>>>> ~
> >>>>>>>>>>
> >>>>>>>>>> I saw some report that this may be a classpath problem. Does
> this
> >>>>>>>>> sounds
> >>>>>>>>>> right to you?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <yanfang724@gmail.com
> >
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Telles,
> >>>>>>>>>>>
> >>>>>>>>>>> It looks correct. Did you put the hdfs-site.xml into your
> >>>>>>>>>>> HADOOP_CONF_DIR
> >>>>>>>>>>> ?(such as ~/.samza/conf)
> >>>>>>>>>>>
> >>>>>>>>>>> Fang, Yan
> >>>>>>>>>>> yanfang724@gmail.com
> >>>>>>>>>>> +1 (206) 849-4108
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
> >>>>>>>>>>> <te...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> ​Hi Yan Fang,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I was able to deploy the file to hdfs, I can see them in all
> my
> >>>>>>>>> nodes
> >>>>>>>>>>> but
> >>>>>>>>>>>> when I tried running I got this error:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Exception in thread "main" java.io.IOException: No FileSystem
> >>>>> for
> >>>>>>>>>>> scheme:
> >>>>>>>>>>>> hdfs
> >>>>>>>>>>>> at
> >>>>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>
> >> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
> >>>>>>>>>>>> at
> >>>>>>>>>>>
> >>>>>>>
> >>>>>>>
> >> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> >>>>>>>>>>>> at
> >>>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >>>>>>>>>>>> at
> >>>>>>>>>>>
> >>>>>>>
> >>>>>>>
> >> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> >>>>>>>>>>>> at
> >>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >>>>>>>>>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> >>>>>>>>>>>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >>>>>>>>>>>> at
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>
> >> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.s
> >>>>>>>>>>> ca
> >>>>>>>>>>> la:111)
> >>>>>>>>>>>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >>>>>>>>>>>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >>>>>>>>>>>> at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >>>>>>>>>>>> at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> >>>>>>>>>>>> at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is my yarn.package.path config:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>
> >> ​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0
> >>>>>>>>>>> .7
> >>>>>>>>>>> .0-dist.tar.gz
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks in advance
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <
> >>>>> yanfang724@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Telles,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In terms of "*I tried pushing the tar file to HDFS but I got
> >>>>> an
> >>>>>>>>>>> error
> >>>>>>>>>>>> from
> >>>>>>>>>>>>> hadoop saying that it couldn’t find core-site.xml file*.", I
> >>>>>>>>> guess
> >>>>>>>>>>> you
> >>>>>>>>>>>> set
> >>>>>>>>>>>>> the HADOOP_CONF_DIR variable and made it point to
> >>>>>> ~/.samza/conf.
> >>>>>>>>> You
> >>>>>>>>>>> can
> >>>>>>>>>>>> do
> >>>>>>>>>>>>> 1) make the HADOOP_CONF_DIR point to the directory where your
> >>>>>>>>> conf
> >>>>>>>>>>> files
> >>>>>>>>>>>>> are, such as /etc/hadoop/conf. Or 2) copy the config files to
> >>>>>>>>>>>>> ~/.samza/conf. Thank you,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheer,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Fang, Yan
> >>>>>>>>>>>>> yanfang724@gmail.com
> >>>>>>>>>>>>> +1 (206) 849-4108
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> >>>>>>>>>>>>> criccomini@linkedin.com.invalid> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hey Telles,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> To get YARN working with the HTTP file system, you need to
> >>>>>>>>> follow
> >>>>>>>>>>> the
> >>>>>>>>>>>>>> instructions on:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
> >>>>>>>>>>> -y
> >>>>>>>>>>>>>> arn.html
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In the "Set Up Http Filesystem for YARN" section.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> You shouldn't need to compile anything (no Gradle, which is
> >>>>>>>>> what
> >>>>>>>>>>> your
> >>>>>>>>>>>>>> stack trace is showing). This setup should be done for all
> >>>>> of
> >>>>>>>>> the
> >>>>>>>>>>> NMs,
> >>>>>>>>>>>>>> since they will be the ones downloading your job's package
> >>>>>>>>> (from
> >>>>>>>>>>>>>> yarn.package.path).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>> Chris
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 8/9/14 9:44 PM, "Telles Nobrega" <
> >>>>> tellesnobrega@gmail.com
> >>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi again, I tried installing the scala libs but the Http
> >>>>>>>>> problem
> >>>>>>>>>>> still
> >>>>>>>>>>>>>>> occurs. I realised that I need to compile incubator samza
> >>>>> in
> >>>>>>>>> the
> >>>>>>>>>>>>> machines
> >>>>>>>>>>>>>>> that I’m going to run the jobs, but the compilation fails
> >>>>>> with
> >>>>>>>>>>> this
> >>>>>>>>>>>> huge
> >>>>>>>>>>>>>>> message:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> #
> >>>>>>>>>>>>>>> # There is insufficient memory for the Java Runtime
> >>>>>>>>> Environment
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>> continue.
> >>>>>>>>>>>>>>> # Native memory allocation (malloc) failed to allocate
> >>>>>>>>> 3946053632
> >>>>>>>>>>>> bytes
> >>>>>>>>>>>>>>> for committing reserved memory.
> >>>>>>>>>>>>>>> # An error report file with more information is saved as:
> >>>>>>>>>>>>>>> #
> >>>>>> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> >>>>>>>>>>>>>>> Could not write standard input into: Gradle Worker 13.
> >>>>>>>>>>>>>>> java.io.IOException: Broken pipe
> >>>>>>>>>>>>>>>    at java.io.FileOutputStream.writeBytes(Native
> >>>>> Method)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>> java.io.FileOutputStream.write(FileOutputStream.java:345)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>
> >>>>>>>
> >> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
> >>>>>>>>>>>> pu
> >>>>>>>>>>>> tH
> >>>>>>>>>>>>>>> andleRunner.java:53)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> >>>>>>>>>>>> rI
> >>>>>>>>>>>> mp
> >>>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> >>>>>>>>>>>> av
> >>>>>>>>>>>> a:
> >>>>>>>>>>>>>>> 1145)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> >>>>>>>>>>>> ja
> >>>>>>>>>>>> va
> >>>>>>>>>>>>>>> :615)
> >>>>>>>>>>>>>>>    at java.lang.Thread.run(Thread.java:744)
> >>>>>>>>>>>>>>> Process 'Gradle Worker 13' finished with non-zero exit
> >>>>>> value 1
> >>>>>>>>>>>>>>> org.gradle.process.internal.ExecException: Process 'Gradle
> >>>>>>>>> Worker
> >>>>>>>>>>> 13'
> >>>>>>>>>>>>>>> finished with non-zero exit value 1
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
> >>>>>>>>>>>> ma
> >>>>>>>>>>>> lE
> >>>>>>>>>>>>>>> xitValue(DefaultExecHandle.java:362)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
> >>>>>>>>>>>> Wo
> >>>>>>>>>>>> rk
> >>>>>>>>>>>>>>> erProcess.java:89)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
> >>>>>>>>>>>> ke
> >>>>>>>>>>>> rP
> >>>>>>>>>>>>>>> rocess.java:33)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
> >>>>>>>>>>>> ef
> >>>>>>>>>>>> au
> >>>>>>>>>>>>>>> ltWorkerProcess.java:55)
> >>>>>>>>>>>>>>>    at
> >>>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>>>>>>>>> Method)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> >>>>>>>>>>>> av
> >>>>>>>>>>>> a:
> >>>>>>>>>>>>>>> 57)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> >>>>>>>>>>>> or
> >>>>>>>>>>>> Im
> >>>>>>>>>>>>>>> pl.java:43)
> >>>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke(Method.java:606)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> >>>>>>>>>>>> sp
> >>>>>>>>>>>> at
> >>>>>>>>>>>>>>> ch.java:35)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> >>>>>>>>>>>> sp
> >>>>>>>>>>>> at
> >>>>>>>>>>>>>>> ch.java:24)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> >>>>>>>>>>>> 81
> >>>>>>>>>>>> )
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> >>>>>>>>>>>> 30
> >>>>>>>>>>>> )
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
> >>>>>>>>>>>> on
> >>>>>>>>>>>> Ha
> >>>>>>>>>>>>>>> ndler.invoke(ProxyDispatchAdapter.java:93)
> >>>>>>>>>>>>>>>    at com.sun.proxy.$Proxy46.executionFinished(Unknown
> >>>>>>>>>>> Source)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
> >>>>>>>>>>>> xe
> >>>>>>>>>>>> cH
> >>>>>>>>>>>>>>> andle.java:212)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
> >>>>>>>>>>>> le
> >>>>>>>>>>>> .j
> >>>>>>>>>>>>>>> ava:309)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
> >>>>>>>>>>>> r.
> >>>>>>>>>>>> ja
> >>>>>>>>>>>>>>> va:108)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
> >>>>>>>>>>>> :8
> >>>>>>>>>>>> 8)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> >>>>>>>>>>>> rI
> >>>>>>>>>>>> mp
> >>>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> >>>>>>>>>>>> av
> >>>>>>>>>>>> a:
> >>>>>>>>>>>>>>> 1145)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> >>>>>>>>>>>> ja
> >>>>>>>>>>>> va
> >>>>>>>>>>>>>>> :615)
> >>>>>>>>>>>>>>>    at java.lang.Thread.run(Thread.java:744)
> >>>>>>>>>>>>>>> OpenJDK 64-Bit Server VM warning: INFO:
> >>>>>>>>>>>>>>> os::commit_memory(0x000000070a6c0000, 3946053632, 0)
> >>>>> failed;
> >>>>>>>>>>>>>>> error='Cannot allocate memory' (errno=12)
> >>>>>>>>>>>>>>> #
> >>>>>>>>>>>>>>> # There is insufficient memory for the Java Runtime
> >>>>>>>>> Environment
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>> continue.
> >>>>>>>>>>>>>>> # Native memory allocation (malloc) failed to allocate
> >>>>>>>>> 3946053632
> >>>>>>>>>>>> bytes
> >>>>>>>>>>>>>>> for committing reserved memory.
> >>>>>>>>>>>>>>> # An error report file with more information is saved as:
> >>>>>>>>>>>>>>> #
> >>>>>> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> >>>>>>>>>>>>>>> Could not write standard input into: Gradle Worker 14.
> >>>>>>>>>>>>>>> java.io.IOException: Broken pipe
> >>>>>>>>>>>>>>>    at java.io.FileOutputStream.writeBytes(Native
> >>>>> Method)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>> java.io.FileOutputStream.write(FileOutputStream.java:345)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>
> >>>>>>>
> >> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
> >>>>>>>>>>>> pu
> >>>>>>>>>>>> tH
> >>>>>>>>>>>>>>> andleRunner.java:53)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> >>>>>>>>>>>> rI
> >>>>>>>>>>>> mp
> >>>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> >>>>>>>>>>>> av
> >>>>>>>>>>>> a:
> >>>>>>>>>>>>>>> 1145)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> >>>>>>>>>>>> ja
> >>>>>>>>>>>> va
> >>>>>>>>>>>>>>> :615)
> >>>>>>>>>>>>>>>    at java.lang.Thread.run(Thread.java:744)
> >>>>>>>>>>>>>>> Process 'Gradle Worker 14' finished with non-zero exit
> >>>>>> value 1
> >>>>>>>>>>>>>>> org.gradle.process.internal.ExecException: Process 'Gradle
> >>>>>>>>> Worker
> >>>>>>>>>>> 14'
> >>>>>>>>>>>>>>> finished with non-zero exit value 1
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
> >>>>>>>>>>>> ma
> >>>>>>>>>>>> lE
> >>>>>>>>>>>>>>> xitValue(DefaultExecHandle.java:362)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
> >>>>>>>>>>>> Wo
> >>>>>>>>>>>> rk
> >>>>>>>>>>>>>>> erProcess.java:89)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
> >>>>>>>>>>>> ke
> >>>>>>>>>>>> rP
> >>>>>>>>>>>>>>> rocess.java:33)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
> >>>>>>>>>>>> ef
> >>>>>>>>>>>> au
> >>>>>>>>>>>>>>> ltWorkerProcess.java:55)
> >>>>>>>>>>>>>>>    at
> >>>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>>>>>>>>> Method)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> >>>>>>>>>>>> av
> >>>>>>>>>>>> a:
> >>>>>>>>>>>>>>> 57)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> >>>>>>>>>>>> or
> >>>>>>>>>>>> Im
> >>>>>>>>>>>>>>> pl.java:43)
> >>>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke(Method.java:606)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> >>>>>>>>>>>> sp
> >>>>>>>>>>>> at
> >>>>>>>>>>>>>>> ch.java:35)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> >>>>>>>>>>>> sp
> >>>>>>>>>>>> at
> >>>>>>>>>>>>>>> ch.java:24)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> >>>>>>>>>>>> 81
> >>>>>>>>>>>> )
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> >>>>>>>>>>>> 30
> >>>>>>>>>>>> )
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
> >>>>>>>>>>>> on
> >>>>>>>>>>>> Ha
> >>>>>>>>>>>>>>> ndler.invoke(ProxyDispatchAdapter.java:93)
> >>>>>>>>>>>>>>>    at com.sun.proxy.$Proxy46.executionFinished(Unknown
> >>>>>>>>>>> Source)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
> >>>>>>>>>>>> xe
> >>>>>>>>>>>> cH
> >>>>>>>>>>>>>>> andle.java:212)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
> >>>>>>>>>>>> le
> >>>>>>>>>>>> .j
> >>>>>>>>>>>>>>> ava:309)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
> >>>>>>>>>>>> r.
> >>>>>>>>>>>> ja
> >>>>>>>>>>>>>>> va:108)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
> >>>>>>>>>>>> :8
> >>>>>>>>>>>> 8)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> >>>>>>>>>>>> rI
> >>>>>>>>>>>> mp
> >>>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> >>>>>>>>>>>> av
> >>>>>>>>>>>> a:
> >>>>>>>>>>>>>>> 1145)
> >>>>>>>>>>>>>>>    at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>
> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> >>>>>>>>>>>> ja
> >>>>>>>>>>>> va
> >>>>>>>>>>>>>>> :615)
> >>>>>>>>>>>>>>>    at java.lang.Thread.r
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Do I need more memory for my machines? Each already has
> >>>>>> 4GB. I
> >>>>>>>>>>> really
> >>>>>>>>>>>>>>> need to have this running. I’m not sure which way is best
> >>>>>>>>> http or
> >>>>>>>>>>> hdfs
> >>>>>>>>>>>>>>> which one you suggest and how can i solve my problem for
> >>>>>> each
> >>>>>>>>>>> case.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks in advance and sorry for bothering this much.
> >>>>>>>>>>>>>>> On 10 Aug 2014, at 00:20, Telles Nobrega
> >>>>>>>>>>> <te...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi Chris, now I have the tar file in my RM machine, and
> >>>>>> the
> >>>>>>>>>>> yarn
> >>>>>>>>>>>> path
> >>>>>>>>>>>>>>>> points to it. I changed the core-site.xml to use
> >>>>>>>>> HttpFileSystem
> >>>>>>>>>>>> instead
> >>>>>>>>>>>>>>>> of HDFS now it is failing with
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Application application_1407640485281_0001 failed 2
> >>>>> times
> >>>>>>>>> due
> >>>>>>>>>>> to
> >>>>>>>>>>> AM
> >>>>>>>>>>>>>>>> Container for appattempt_1407640485281_0001_000002 exited
> >>>>>>>>> with
> >>>>>>>>>>>>>>>> exitCode:-1000 due to: java.lang.ClassNotFoundException:
> >>>>>>>>> Class
> >>>>>>>>>>>>>>>> org.apache.samza.util.hadoop.HttpFileSystem not found
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I think I can solve this just installing scala files
> >>>>> from
> >>>>>>>>> the
> >>>>>>>>>>> samza
> >>>>>>>>>>>>>>>> tutorial, can you confirm that?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 09 Aug 2014, at 08:34, Telles Nobrega
> >>>>>>>>>>> <tellesnobrega@gmail.com
> >>>>>>>>>>>>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi Chris,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I think the problem is that I forgot to update the
> >>>>>>>>>>>> yarn.job.package.
> >>>>>>>>>>>>>>>>> I will try again to see if it works now.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I have one more question, how can I stop (command line)
> >>>>>> the
> >>>>>>>>>>> jobs
> >>>>>>>>>>>>>>>>> running in my topology, for the experiment that I will
> >>>>>> run,
> >>>>>>>>> I
> >>>>>>>>>>> need
> >>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> run the same job in 4 minutes intervals. So I need to
> >>>>> kill
> >>>>>>>>> it,
> >>>>>>>>>>> clean
> >>>>>>>>>>>>>>>>> the kafka topics and rerun.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks in advance.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On 08 Aug 2014, at 12:41, Chris Riccomini
> >>>>>>>>>>>>>>>>> <cr...@linkedin.com.INVALID> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hey Telles,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Do I need to have the job folder on each machine in
> >>>>> my
> >>>>>>>>>>> cluster?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> No, you should not need to do this. There are two ways
> >>>>>> to
> >>>>>>>>>>> deploy
> >>>>>>>>>>>>> your
> >>>>>>>>>>>>>>>>>> tarball to the YARN grid. One is to put it in HDFS,
> >>>>> and
> >>>>>>>>> the
> >>>>>>>>>>> other
> >>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>> put it on an HTTP server. The link to running a Samza
> >>>>>> job
> >>>>>>>>> in
> >>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>> multi-node
> >>>>>>>>>>>>>>>>>> YARN cluster describes how to do both (either HTTP
> >>>>>> server
> >>>>>>>>> or
> >>>>>>>>>>>> HDFS).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> In both cases, once the tarball is put in on the
> >>>>>> HTTP/HDFS
> >>>>>>>>>>>>> server(s),
> >>>>>>>>>>>>>>>>>> you
> >>>>>>>>>>>>>>>>>> must update yarn.package.path to point to it. From
> >>>>>> there,
> >>>>>>>>> the
> >>>>>>>>>>> YARN
> >>>>>>>>>>>>> NM
> >>>>>>>>>>>>>>>>>> should download it for you automatically when you
> >>>>> start
> >>>>>>>>> your
> >>>>>>>>>>> job.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> * Can you send along a paste of your job config?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>> Chris
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 8/8/14 8:04 AM, "Claudio Martins"
> >>>>>>>>>>> <cl...@mobileaware.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Hi Telles, it looks to me that you forgot to update
> >>>>> the
> >>>>>>>>>>>>>>>>>>> "yarn.package.path"
> >>>>>>>>>>>>>>>>>>> attribute in your config file for the task.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> - Claudio Martins
> >>>>>>>>>>>>>>>>>>> Head of Engineering
> >>>>>>>>>>>>>>>>>>> MobileAware USA Inc. / www.mobileaware.com
> >>>>>>>>>>>>>>>>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> >>>>>>>>>>>>>>>>>>> linkedin: www.linkedin.com/in/martinsclaudio
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> >>>>>>>>>>>>>>>>>>> <te...@gmail.com>
> >>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> this is my first time trying to run a job on a
> >>>>>> multinode
> >>>>>>>>>>>>>>>>>>>> environment. I
> >>>>>>>>>>>>>>>>>>>> have the cluster set up, I can see in the GUI that
> >>>>> all
> >>>>>>>>>>> nodes
> >>>>>>>>>>> are
> >>>>>>>>>>>>>>>>>>>> working.
> >>>>>>>>>>>>>>>>>>>> Do I need to have the job folder on each machine in
> >>>>> my
> >>>>>>>>>>> cluster?
> >>>>>>>>>>>>>>>>>>>> - The first time I tried running with the job on the
> >>>>>>>>>>> namenode
> >>>>>>>>>>>>>>>>>>>> machine
> >>>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>> it failed saying:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Application application_1407509228798_0001 failed 2
> >>>>>>>>> times
> >>>>>>>>>>> due
> >>>>>>>>>>> to
> >>>>>>>>>>>>> AM
> >>>>>>>>>>>>>>>>>>>> Container for appattempt_1407509228798_0001_000002
> >>>>>>>>> exited
> >>>>>>>>>>> with
> >>>>>>>>>>>>>>>>>>>> exitCode:
> >>>>>>>>>>>>>>>>>>>> -1000 due to: File
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>>>>>>
> >> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> >>>>>>>>>>>>>>>>> pa
> >>>>>>>>>>>>>>>>> ck
> >>>>>>>>>>>>>>>>>>>> age-
> >>>>>>>>>>>>>>>>>>>> 0.7.0-dist.tar.gz
> >>>>>>>>>>>>>>>>>>>> does not exist
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> So I copied the folder to each machine in my cluster
> >>>>>> and
> >>>>>>>>>>> got
> >>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>>>> error:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Application application_1407509228798_0002 failed 2
> >>>>>>>>> times
> >>>>>>>>>>> due
> >>>>>>>>>>> to
> >>>>>>>>>>>>> AM
> >>>>>>>>>>>>>>>>>>>> Container for appattempt_1407509228798_0002_000002
> >>>>>>>>> exited
> >>>>>>>>>>> with
> >>>>>>>>>>>>>>>>>>>> exitCode:
> >>>>>>>>>>>>>>>>>>>> -1000 due to: Resource
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>>>>>>>>>>
> >> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> >>>>>>>>>>>>>>>>> pa
> >>>>>>>>>>>>>>>>> ck
> >>>>>>>>>>>>>>>>>>>> age-
> >>>>>>>>>>>>>>>>>>>> 0.7.0-dist.tar.gz
> >>>>>>>>>>>>>>>>>>>> changed on src filesystem (expected 1407509168000,
> >>>>> was
> >>>>>>>>>>>>> 1407509434000
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> What am I missing?
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> p.s.: I followed this
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>
> https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> >>>>>>>>>>>>>>>>>>>> a>
> >>>>>>>>>>>>>>>>>>>> tutorial
> >>>>>>>>>>>>>>>>>>>> and this
> >>>>>>>>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>
> >> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> >>>>>>>>>>>>>>>>>>>> node
> >>>>>>>>>>>>>>>>>>>> -yarn.html
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>> set up the cluster.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Help is much appreciated.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Thanks in advance.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>> ------------------------------------------
> >>>>>>>>>>>>>>>>>>>> Telles Mota Vidal Nobrega
> >>>>>>>>>>>>>>>>>>>> M.sc. Candidate at UFCG
> >>>>>>>>>>>>>>>>>>>> B.sc. in Computer Science at UFCG
> >>>>>>>>>>>>>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> ------------------------------------------
> >>>>>>>>>>>> Telles Mota Vidal Nobrega
> >>>>>>>>>>>> M.sc. Candidate at UFCG
> >>>>>>>>>>>> B.sc. in Computer Science at UFCG
> >>>>>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> ------------------------------------------
> >>>>>>>>>> Telles Mota Vidal Nobrega
> >>>>>>>>>> M.sc. Candidate at UFCG
> >>>>>>>>>> B.sc. in Computer Science at UFCG
> >>>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> ------------------------------------------
> >>>>>>>> Telles Mota Vidal Nobrega
> >>>>>>>> M.sc. Candidate at UFCG
> >>>>>>>> B.sc. in Computer Science at UFCG
> >>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> ------------------------------------------
> >>>>> Telles Mota Vidal Nobrega
> >>>>> M.sc. Candidate at UFCG
> >>>>> B.sc. in Computer Science at UFCG
> >>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>
> >>>
> >>
> >>
>
>

Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Same thing happens.
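Worth noting: Configuration$DeprecationDelta is part of hadoop-common, so a NoClassDefFoundError for it while hadoop-hdfs-2.3.0 is loading usually points at mixed Hadoop jar versions on the submitting classpath, e.g. an older hadoop-common next to the 2.3.0 hdfs jar. A quick, illustrative check against the hello-samza style layout:

  ls deploy/samza/lib | grep hadoop    # every hadoop-* jar should report the same version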

On 11 Aug 2014, at 21:05, Yan Fang <ya...@gmail.com> wrote:

> Cool, we are almost there. Could you remove
> 
> <property>
>  <name>fs.hdfs.impl</name>
>  <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
>  <description>The FileSystem for hdfs: uris.</description>
> </property>
> 
> To see how it works?
> 
> 
> Fang, Yan
> yanfang724@gmail.com
> +1 (206) 849-4108
> 
> 
> On Mon, Aug 11, 2014 at 5:03 PM, Telles Nobrega <te...@gmail.com>
> wrote:
> 
>> You may forget this last email, I was really stupid and put the files in a
>> different folder. Now it could find the file but it’s not there yet…
>> another error came up
>> 
>> Exception in thread "main" java.util.ServiceConfigurationError:
>> org.apache.hadoop.fs.FileSystem: Provider
>> org.apache.hadoop.hdfs.DistributedFileSystem could not be instantiated
>>        at java.util.ServiceLoader.fail(ServiceLoader.java:224)
>>        at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
>>        at
>> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
>>        at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
>>        at
>> org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2400)
>>        at
>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
>>        at
>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>>        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>>        at
>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>>        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>>        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>>        at
>> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>>        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>        at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>>        at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>>        at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>> Caused by: java.lang.NoClassDefFoundError:
>> org/apache/hadoop/conf/Configuration$DeprecationDelta
>>        at
>> org.apache.hadoop.hdfs.HdfsConfiguration.addDeprecatedKeys(HdfsConfiguration.java:66)
>>        at
>> org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:31)
>>        at
>> org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:106)
>>        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>> Method)
>>        at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>        at java.lang.Class.newInstance(Class.java:374)
>>        at
>> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
>>        ... 15 more
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.hadoop.conf.Configuration$DeprecationDelta
>>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>        ... 24 more
>> 
>> On 11 Aug 2014, at 20:45, Telles Nobrega <te...@gmail.com> wrote:
>> 
>>> Hi, I copied hadoop-hdfs-2.3.0 to my-job/lib and it changed the error
>> which is good but the error is back to
>>> 
>>> Exception in thread "main" java.lang.RuntimeException:
>> java.lang.ClassNotFoundException: Class
>> org.apache.hadoop.hdfs.DistributedFileSystem not found
>>>      at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1720)
>>>      at
>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2415)
>>>      at
>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>>>      at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>>>      at
>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>>>      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>>>      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>>>      at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>>>      at
>> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>>>      at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>>      at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>>      at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>>>      at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>>>      at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>>> Caused by: java.lang.ClassNotFoundException: Class
>> org.apache.hadoop.hdfs.DistributedFileSystem not found
>>>      at
>> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
>>>      at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1718)
>>>      ... 13 more
>>> 
>>> Do I need to have this lib in all nodes at the job folder or just to
>> submit?
>>> 
>>> On 11 Aug 2014, at 20:11, Yan Fang <ya...@gmail.com> wrote:
>>> 
>>>> Hi Telles,
>>>> 
>>>> I reproduced your problem and think I figured out why CLASSPATH does not
>>>> work: in our script bin/run-class.sh, we have the line
>>>> "CLASSPATH=$HADOOP_CONF_DIR", which actually ignores your setting.
>>>> 
>>>> So a simple solution is to copy the hadoop-hdfs.jar to your samza lib
>>>> directory. Then run bin/run-job --config-factory=...
>> --config-path=... .
>>>> Let me know how it goes. Thank you.
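Spelled out as commands, that suggestion looks roughly like this; the deploy/samza layout and the my-job.properties file name follow the hello-samza tutorial and are only an example:

  cp ~/hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar deploy/samza/lib/
  deploy/samza/bin/run-job.sh \
    --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory \
    --config-path=file://$PWD/deploy/samza/config/my-job.properties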
>>>> 
>>>> Cheers,
>>>> 
>>>> Fang, Yan
>>>> yanfang724@gmail.com
>>>> +1 (206) 849-4108
>>>> 
>>>> 
>>>> On Mon, Aug 11, 2014 at 4:07 PM, Telles Nobrega <
>> tellesnobrega@gmail.com>
>>>> wrote:
>>>> 
>>>>> Sure, thanks.
>>>>> 
>>>>> 
>>>>> On Mon, Aug 11, 2014 at 6:22 PM, Yan Fang <ya...@gmail.com>
>> wrote:
>>>>> 
>>>>>> Hi Telles,
>>>>>> 
>>>>>> I am not sure whether exporting the CLASSPATH works. (sometimes it
>> does
>>>>> not
>>>>>> work for me...) My suggestion is to include the hdfs jar explicitly in
>>>>> the
>>>>>> package that you upload to hdfs. Also , remember to put the jar into
>> your
>>>>>> local samza (which is deploy/samza/lib if you go with the hello-samza
>>>>>> tutorial). Let me know if that works.
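A quick way to confirm the jar really ended up in both places (paths follow the hello-samza layout and the package name used earlier in this thread):

  tar -tzf samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz | grep hadoop-hdfs
  ls deploy/samza/lib | grep hadoop-hdfs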
>>>>>> 
>>>>>> Cheers,
>>>>>> 
>>>>>> Fang, Yan
>>>>>> yanfang724@gmail.com
>>>>>> +1 (206) 849-4108
>>>>>> 
>>>>>> 
>>>>>> On Mon, Aug 11, 2014 at 2:04 PM, Chris Riccomini <
>>>>>> criccomini@linkedin.com.invalid> wrote:
>>>>>> 
>>>>>>> Hey Telles,
>>>>>>> 
>>>>>>> Hmm. I'm out of ideas. If Zhijie is around, he'd probably be of use,
>>>>> but
>>>>>> I
>>>>>>> haven't heard from him in a while.
>>>>>>> 
>>>>>>> I'm afraid your best bet is probably to email the YARN dev mailing
>>>>> list,
>>>>>>> since this is a YARN config issue.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Chris
>>>>>>> 
>>>>>>> On 8/11/14 1:58 PM, "Telles Nobrega" <te...@gmail.com>
>> wrote:
>>>>>>> 
>>>>>>>> ​I exported ​export
>>>>>>> 
>>>>>> 
>>>>>> 
>> CLASSPATH=$CLASSPATH:hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar
>>>>>>>> and the same problem still happened.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Aug 11, 2014 at 5:35 PM, Chris Riccomini <
>>>>>>>> criccomini@linkedin.com.invalid> wrote:
>>>>>>>> 
>>>>>>>>> Hey Telles,
>>>>>>>>> 
>>>>>>>>> It sounds like either the HDFS jar is missing from the classpath,
>> or
>>>>>> the
>>>>>>>>> hdfs file system needs to be configured:
>>>>>>>>> 
>>>>>>>>> <property>
>>>>>>>>> <name>fs.hdfs.impl</name>
>>>>>>>>> <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
>>>>>>>>> <description>The FileSystem for hdfs: uris.</description>
>>>>>>>>> </property>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> (from
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptA
>>>>>>>>> zE
>>>>>>>>> 0)
>>>>>>>>> 
>>>>>>>>> I believe this will need to be configured for your NM.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Chris
>>>>>>>>> 
>>>>>>>>> On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com>
>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Yes, it is like this:
>>>>>>>>>> 
>>>>>>>>>> <configuration>
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.datanode.data.dir</name>
>>>>>>>>>> <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
>>>>>>>>>> <description>Comma separated list of paths on the local
>>>>>> filesystem
>>>>>>>>> of
>>>>>>>>>> a
>>>>>>>>>> DataNode where it should store its blocks.</description>
>>>>>>>>>> </property>
>>>>>>>>>> 
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.namenode.name.dir</name>
>>>>>>>>>> <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
>>>>>>>>>> <description>Path on the local filesystem where the NameNode
>>>>>> stores
>>>>>>>>>> the
>>>>>>>>>> namespace and transaction logs persistently.</description>
>>>>>>>>>> </property>
>>>>>>>>>> </configuration>
>>>>>>>>>> ~
>>>>>>>>>> 
>>>>>>>>>> I saw some report that this may be a classpath problem. Does this
>>>>>>>>> sound
>>>>>>>>>> right to you?
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi Telles,
>>>>>>>>>>> 
>>>>>>>>>>> It looks correct. Did you put the hdfs-site.xml into your
>>>>>>>>>>> HADOOP_CONF_DIR
>>>>>>>>>>> ?(such as ~/.samza/conf)
>>>>>>>>>>> 
>>>>>>>>>>> Fang, Yan
>>>>>>>>>>> yanfang724@gmail.com
>>>>>>>>>>> +1 (206) 849-4108
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
>>>>>>>>>>> <te...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> ​Hi Yan Fang,
>>>>>>>>>>>> 
>>>>>>>>>>>> I was able to deploy the file to hdfs, I can see them in all my
>>>>>>>>> nodes
>>>>>>>>>>> but
>>>>>>>>>>>> when I tried running I got this error:
>>>>>>>>>>>> 
>>>>>>>>>>>> Exception in thread "main" java.io.IOException: No FileSystem
>>>>> for
>>>>>>>>>>> scheme:
>>>>>>>>>>>> hdfs
>>>>>>>>>>>> at
>>>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>> 
>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
>>>>>>>>>>>> at
>>>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>>>>>>>>>>>> at
>>>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>>>>>>>>>>>> at
>>>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>>>>>>>>>>>> at
>>>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>>>>>>>>>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>>>>>>>>>>>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>>>>>>>>>>>> at
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>> 
>> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.s
>>>>>>>>>>> ca
>>>>>>>>>>> la:111)
>>>>>>>>>>>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>>>>>>>>>>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>>>>>>>>>>> at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>>>>>>>>>>>> at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>>>>>>>>>>>> at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> This is my yarn.package.path config:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>> 
>> ​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0
>>>>>>>>>>> .7
>>>>>>>>>>> .0-dist.tar.gz
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks in advance
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <
>>>>> yanfang724@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Telles,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> In terms of "*I tried pushing the tar file to HDFS but I got
>>>>> an
>>>>>>>>>>> error
>>>>>>>>>>>> from
>>>>>>>>>>>>> hadoop saying that it couldn’t find core-site.xml file*.", I
>>>>>>>>> guess
>>>>>>>>>>> you
>>>>>>>>>>>> set
>>>>>>>>>>>>> the HADOOP_CONF_DIR variable and made it point to
>>>>>> ~/.samza/conf.
>>>>>>>>> You
>>>>>>>>>>> can
>>>>>>>>>>>> do
>>>>>>>>>>>>> 1) make the HADOOP_CONF_DIR point to the directory where your
>>>>>>>>> conf
>>>>>>>>>>> files
>>>>>>>>>>>>> are, such as /etc/hadoop/conf. Or 2) copy the config files to
>>>>>>>>>>>>> ~/.samza/conf. Thank you,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Cheer,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Fang, Yan
>>>>>>>>>>>>> yanfang724@gmail.com
>>>>>>>>>>>>> +1 (206) 849-4108
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
>>>>>>>>>>>>> criccomini@linkedin.com.invalid> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hey Telles,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> To get YARN working with the HTTP file system, you need to
>>>>>>>>> follow
>>>>>>>>>>> the
>>>>>>>>>>>>>> instructions on:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
>>>>>>>>>>> -y
>>>>>>>>>>>>>> arn.html
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In the "Set Up Http Filesystem for YARN" section.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> You shouldn't need to compile anything (no Gradle, which is
>>>>>>>>> what
>>>>>>>>>>> your
>>>>>>>>>>>>>> stack trace is showing). This setup should be done for all
>>>>> of
>>>>>>>>> the
>>>>>>>>>>> NMs,
>>>>>>>>>>>>>> since they will be the ones downloading your job's package
>>>>>>>>> (from
>>>>>>>>>>>>>> yarn.package.path).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 8/9/14 9:44 PM, "Telles Nobrega" <
>>>>> tellesnobrega@gmail.com
>>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi again, I tried installing the scala libs but the Http
>>>>>>>>> problem
>>>>>>>>>>> still
>>>>>>>>>>>>>>> occurs. I realised that I need to compile incubator samza
>>>>> in
>>>>>>>>> the
>>>>>>>>>>>>> machines
>>>>>>>>>>>>>>> that I'm going to run the jobs, but the compilation fails
>>>>>> with
>>>>>>>>>>> this
>>>>>>>>>>>> huge
>>>>>>>>>>>>>>> message:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>> # There is insufficient memory for the Java Runtime
>>>>>>>>> Environment
>>>>>>>>>>> to
>>>>>>>>>>>>>>> continue.
>>>>>>>>>>>>>>> # Native memory allocation (malloc) failed to allocate
>>>>>>>>> 3946053632
>>>>>>>>>>>> bytes
>>>>>>>>>>>>>>> for committing reserved memory.
>>>>>>>>>>>>>>> # An error report file with more information is saved as:
>>>>>>>>>>>>>>> #
>>>>>> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
>>>>>>>>>>>>>>> Could not write standard input into: Gradle Worker 13.
>>>>>>>>>>>>>>> java.io.IOException: Broken pipe
>>>>>>>>>>>>>>>    at java.io.FileOutputStream.writeBytes(Native
>>>>> Method)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>> java.io.FileOutputStream.write(FileOutputStream.java:345)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
>>>>>>>>>>>> pu
>>>>>>>>>>>> tH
>>>>>>>>>>>>>>> andleRunner.java:53)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>>>>>>>> rI
>>>>>>>>>>>> mp
>>>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>>>>>>>> av
>>>>>>>>>>>> a:
>>>>>>>>>>>>>>> 1145)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>>>>>>>> ja
>>>>>>>>>>>> va
>>>>>>>>>>>>>>> :615)
>>>>>>>>>>>>>>>    at java.lang.Thread.run(Thread.java:744)
>>>>>>>>>>>>>>> Process 'Gradle Worker 13' finished with non-zero exit
>>>>>> value 1
>>>>>>>>>>>>>>> org.gradle.process.internal.ExecException: Process 'Gradle
>>>>>>>>> Worker
>>>>>>>>>>> 13'
>>>>>>>>>>>>>>> finished with non-zero exit value 1
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
>>>>>>>>>>>> ma
>>>>>>>>>>>> lE
>>>>>>>>>>>>>>> xitValue(DefaultExecHandle.java:362)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
>>>>>>>>>>>> Wo
>>>>>>>>>>>> rk
>>>>>>>>>>>>>>> erProcess.java:89)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
>>>>>>>>>>>> ke
>>>>>>>>>>>> rP
>>>>>>>>>>>>>>> rocess.java:33)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
>>>>>>>>>>>> ef
>>>>>>>>>>>> au
>>>>>>>>>>>>>>> ltWorkerProcess.java:55)
>>>>>>>>>>>>>>>    at
>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>>>>>>>>> Method)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
>>>>>>>>>>>> av
>>>>>>>>>>>> a:
>>>>>>>>>>>>>>> 57)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
>>>>>>>>>>>> or
>>>>>>>>>>>> Im
>>>>>>>>>>>>>>> pl.java:43)
>>>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>>>>>>>> sp
>>>>>>>>>>>> at
>>>>>>>>>>>>>>> ch.java:35)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>>>>>>>> sp
>>>>>>>>>>>> at
>>>>>>>>>>>>>>> ch.java:24)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>>>>>>>> 81
>>>>>>>>>>>> )
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>>>>>>>> 30
>>>>>>>>>>>> )
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
>>>>>>>>>>>> on
>>>>>>>>>>>> Ha
>>>>>>>>>>>>>>> ndler.invoke(ProxyDispatchAdapter.java:93)
>>>>>>>>>>>>>>>    at com.sun.proxy.$Proxy46.executionFinished(Unknown
>>>>>>>>>>> Source)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
>>>>>>>>>>>> xe
>>>>>>>>>>>> cH
>>>>>>>>>>>>>>> andle.java:212)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
>>>>>>>>>>>> le
>>>>>>>>>>>> .j
>>>>>>>>>>>>>>> ava:309)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
>>>>>>>>>>>> r.
>>>>>>>>>>>> ja
>>>>>>>>>>>>>>> va:108)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
>>>>>>>>>>>> :8
>>>>>>>>>>>> 8)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>>>>>>>> rI
>>>>>>>>>>>> mp
>>>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>>>>>>>> av
>>>>>>>>>>>> a:
>>>>>>>>>>>>>>> 1145)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>>>>>>>> ja
>>>>>>>>>>>> va
>>>>>>>>>>>>>>> :615)
>>>>>>>>>>>>>>>    at java.lang.Thread.run(Thread.java:744)
>>>>>>>>>>>>>>> OpenJDK 64-Bit Server VM warning: INFO:
>>>>>>>>>>>>>>> os::commit_memory(0x000000070a6c0000, 3946053632, 0)
>>>>> failed;
>>>>>>>>>>>>>>> error='Cannot allocate memory' (errno=12)
>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>> # There is insufficient memory for the Java Runtime
>>>>>>>>> Environment
>>>>>>>>>>> to
>>>>>>>>>>>>>>> continue.
>>>>>>>>>>>>>>> # Native memory allocation (malloc) failed to allocate
>>>>>>>>> 3946053632
>>>>>>>>>>>> bytes
>>>>>>>>>>>>>>> for committing reserved memory.
>>>>>>>>>>>>>>> # An error report file with more information is saved as:
>>>>>>>>>>>>>>> #
>>>>>> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
>>>>>>>>>>>>>>> Could not write standard input into: Gradle Worker 14.
>>>>>>>>>>>>>>> java.io.IOException: Broken pipe
>>>>>>>>>>>>>>>    at java.io.FileOutputStream.writeBytes(Native
>>>>> Method)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>> java.io.FileOutputStream.write(FileOutputStream.java:345)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
>>>>>>>>>>>> pu
>>>>>>>>>>>> tH
>>>>>>>>>>>>>>> andleRunner.java:53)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>>>>>>>> rI
>>>>>>>>>>>> mp
>>>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>>>>>>>> av
>>>>>>>>>>>> a:
>>>>>>>>>>>>>>> 1145)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>>>>>>>> ja
>>>>>>>>>>>> va
>>>>>>>>>>>>>>> :615)
>>>>>>>>>>>>>>>    at java.lang.Thread.run(Thread.java:744)
>>>>>>>>>>>>>>> Process 'Gradle Worker 14' finished with non-zero exit
>>>>>> value 1
>>>>>>>>>>>>>>> org.gradle.process.internal.ExecException: Process 'Gradle
>>>>>>>>> Worker
>>>>>>>>>>> 14'
>>>>>>>>>>>>>>> finished with non-zero exit value 1
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
>>>>>>>>>>>> ma
>>>>>>>>>>>> lE
>>>>>>>>>>>>>>> xitValue(DefaultExecHandle.java:362)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
>>>>>>>>>>>> Wo
>>>>>>>>>>>> rk
>>>>>>>>>>>>>>> erProcess.java:89)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
>>>>>>>>>>>> ke
>>>>>>>>>>>> rP
>>>>>>>>>>>>>>> rocess.java:33)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
>>>>>>>>>>>> ef
>>>>>>>>>>>> au
>>>>>>>>>>>>>>> ltWorkerProcess.java:55)
>>>>>>>>>>>>>>>    at
>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>>>>>>>>> Method)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
>>>>>>>>>>>> av
>>>>>>>>>>>> a:
>>>>>>>>>>>>>>> 57)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
>>>>>>>>>>>> or
>>>>>>>>>>>> Im
>>>>>>>>>>>>>>> pl.java:43)
>>>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>>>>>>>> sp
>>>>>>>>>>>> at
>>>>>>>>>>>>>>> ch.java:35)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>>>>>>>> sp
>>>>>>>>>>>> at
>>>>>>>>>>>>>>> ch.java:24)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>>>>>>>> 81
>>>>>>>>>>>> )
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>>>>>>>> 30
>>>>>>>>>>>> )
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
>>>>>>>>>>>> on
>>>>>>>>>>>> Ha
>>>>>>>>>>>>>>> ndler.invoke(ProxyDispatchAdapter.java:93)
>>>>>>>>>>>>>>>    at com.sun.proxy.$Proxy46.executionFinished(Unknown
>>>>>>>>>>> Source)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
>>>>>>>>>>>> xe
>>>>>>>>>>>> cH
>>>>>>>>>>>>>>> andle.java:212)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
>>>>>>>>>>>> le
>>>>>>>>>>>> .j
>>>>>>>>>>>>>>> ava:309)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
>>>>>>>>>>>> r.
>>>>>>>>>>>> ja
>>>>>>>>>>>>>>> va:108)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
>>>>>>>>>>>> :8
>>>>>>>>>>>> 8)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>>>>>>>> rI
>>>>>>>>>>>> mp
>>>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>>>>>>>> av
>>>>>>>>>>>> a:
>>>>>>>>>>>>>>> 1145)
>>>>>>>>>>>>>>>    at
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>> 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>>>>>>>> ja
>>>>>>>>>>>> va
>>>>>>>>>>>>>>> :615)
>>>>>>>>>>>>>>>    at java.lang.Thread.r
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Do I need more memory for my machines? Each already has
>>>>>> 4GB. I
>>>>>>>>>>> really
>>>>>>>>>>>>>>> need to have this running. I'm not sure which way is best
>>>>>>>>> http or
>>>>>>>>>>> hdfs
>>>>>>>>>>>>>>> which one you suggest and how can I solve my problem for
>>>>>> each
>>>>>>>>>>> case.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks in advance and sorry for bothering this much.
>>>>>>>>>>>>>>> On 10 Aug 2014, at 00:20, Telles Nobrega
>>>>>>>>>>> <te...@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi Chris, now I have the tar file in my RM machine, and
>>>>>> the
>>>>>>>>>>> yarn
>>>>>>>>>>>> path
>>>>>>>>>>>>>>>> points to it. I changed the core-site.xml to use
>>>>>>>>> HttpFileSystem
>>>>>>>>>>>> instead
>>>>>>>>>>>>>>>> of HDFS now it is failing with
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Application application_1407640485281_0001 failed 2
>>>>> times
>>>>>>>>> due
>>>>>>>>>>> to
>>>>>>>>>>> AM
>>>>>>>>>>>>>>>> Container for appattempt_1407640485281_0001_000002 exited
>>>>>>>>> with
>>>>>>>>>>>>>>>> exitCode:-1000 due to: java.lang.ClassNotFoundException:
>>>>>>>>> Class
>>>>>>>>>>>>>>>> org.apache.samza.util.hadoop.HttpFileSystem not found
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I think I can solve this just installing scala files
>>>>> from
>>>>>>>>> the
>>>>>>>>>>> samza
>>>>>>>>>>>>>>>> tutorial, can you confirm that?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On 09 Aug 2014, at 08:34, Telles Nobrega
>>>>>>>>>>> <tellesnobrega@gmail.com
>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I think the problem is that I forgot to update the
>>>>>>>>>>>> yarn.job.package.
>>>>>>>>>>>>>>>>> I will try again to see if it works now.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I have one more question, how can I stop (command line)
>>>>>> the
>>>>>>>>>>> jobs
>>>>>>>>>>>>>>>>> running in my topology, for the experiment that I will
>>>>>> run,
>>>>>>>>> I
>>>>>>>>>>> need
>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> run the same job in 4 minutes intervals. So I need to
>>>>> kill
>>>>>>>>> it,
>>>>>>>>>>> clean
>>>>>>>>>>>>>>>>> the kafka topics and rerun.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks in advance.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On 08 Aug 2014, at 12:41, Chris Riccomini
>>>>>>>>>>>>>>>>> <cr...@linkedin.com.INVALID> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Hey Telles,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Do I need to have the job folder on each machine in
>>>>> my
>>>>>>>>>>> cluster?
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> No, you should not need to do this. There are two ways
>>>>>> to
>>>>>>>>>>> deploy
>>>>>>>>>>>>> your
>>>>>>>>>>>>>>>>>> tarball to the YARN grid. One is to put it in HDFS,
>>>>> and
>>>>>>>>> the
>>>>>>>>>>> other
>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> put it on an HTTP server. The link to running a Samza
>>>>>> job
>>>>>>>>> in
>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>> multi-node
>>>>>>>>>>>>>>>>>> YARN cluster describes how to do both (either HTTP
>>>>>> server
>>>>>>>>> or
>>>>>>>>>>>> HDFS).
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> In both cases, once the tarball is put in on the
>>>>>> HTTP/HDFS
>>>>>>>>>>>>> server(s),
>>>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>> must update yarn.package.path to point to it. From
>>>>>> there,
>>>>>>>>> the
>>>>>>>>>>> YARN
>>>>>>>>>>>>> NM
>>>>>>>>>>>>>>>>>> should download it for you automatically when you
>>>>> start
>>>>>>>>> your
>>>>>>>>>>> job.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> * Can you send along a paste of your job config?
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On 8/8/14 8:04 AM, "Claudio Martins"
>>>>>>>>>>> <cl...@mobileaware.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Hi Telles, it looks to me that you forgot to update
>>>>> the
>>>>>>>>>>>>>>>>>>> "yarn.package.path"
>>>>>>>>>>>>>>>>>>> attribute in your config file for the task.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> - Claudio Martins
>>>>>>>>>>>>>>>>>>> Head of Engineering
>>>>>>>>>>>>>>>>>>> MobileAware USA Inc. / www.mobileaware.com
>>>>>>>>>>>>>>>>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
>>>>>>>>>>>>>>>>>>> linkedin: www.linkedin.com/in/martinsclaudio
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
>>>>>>>>>>>>>>>>>>> <te...@gmail.com>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> this is my first time trying to run a job on a
>>>>>> multinode
>>>>>>>>>>>>>>>>>>>> environment. I
>>>>>>>>>>>>>>>>>>>> have the cluster set up, I can see in the GUI that
>>>>> all
>>>>>>>>>>> nodes
>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>> working.
>>>>>>>>>>>>>>>>>>>> Do I need to have the job folder on each machine in
>>>>> my
>>>>>>>>>>> cluster?
>>>>>>>>>>>>>>>>>>>> - The first time I tried running with the job on the
>>>>>>>>>>> namenode
>>>>>>>>>>>>>>>>>>>> machine
>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> it failed saying:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Application application_1407509228798_0001 failed 2
>>>>>>>>> times
>>>>>>>>>>> due
>>>>>>>>>>> to
>>>>>>>>>>>>> AM
>>>>>>>>>>>>>>>>>>>> Container for appattempt_1407509228798_0001_000002
>>>>>>>>> exited
>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>> exitCode:
>>>>>>>>>>>>>>>>>>>> -1000 due to: File
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>> 
>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
>>>>>>>>>>>>>>>>> pa
>>>>>>>>>>>>>>>>> ck
>>>>>>>>>>>>>>>>>>>> age-
>>>>>>>>>>>>>>>>>>>> 0.7.0-dist.tar.gz
>>>>>>>>>>>>>>>>>>>> does not exist
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> So I copied the folder to each machine in my cluster
>>>>>> and
>>>>>>>>>>> got
>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>> error:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Application application_1407509228798_0002 failed 2
>>>>>>>>> times
>>>>>>>>>>> due
>>>>>>>>>>> to
>>>>>>>>>>>>> AM
>>>>>>>>>>>>>>>>>>>> Container for appattempt_1407509228798_0002_000002
>>>>>>>>> exited
>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>> exitCode:
>>>>>>>>>>>>>>>>>>>> -1000 due to: Resource
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>> 
>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
>>>>>>>>>>>>>>>>> pa
>>>>>>>>>>>>>>>>> ck
>>>>>>>>>>>>>>>>>>>> age-
>>>>>>>>>>>>>>>>>>>> 0.7.0-dist.tar.gz
>>>>>>>>>>>>>>>>>>>> changed on src filesystem (expected 1407509168000,
>>>>> was
>>>>>>>>>>>>> 1407509434000
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> What am I missing?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> p.s.: I followed this
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> <
>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>> https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
>>>>>>>>>>>>>>>>>>>> a>
>>>>>>>>>>>>>>>>>>>> tutorial
>>>>>>>>>>>>>>>>>>>> and this
>>>>>>>>>>>>>>>>>>>> <
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>> 
>> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
>>>>>>>>>>>>>>>>>>>> node
>>>>>>>>>>>>>>>>>>>> -yarn.html
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> set up the cluster.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Help is much appreciated.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks in advance.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> ------------------------------------------
>>>>>>>>>>>>>>>>>>>> Telles Mota Vidal Nobrega
>>>>>>>>>>>>>>>>>>>> M.sc. Candidate at UFCG
>>>>>>>>>>>>>>>>>>>> B.sc. in Computer Science at UFCG
>>>>>>>>>>>>>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> ------------------------------------------
>>>>>>>>>>>> Telles Mota Vidal Nobrega
>>>>>>>>>>>> M.sc. Candidate at UFCG
>>>>>>>>>>>> B.sc. in Computer Science at UFCG
>>>>>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> ------------------------------------------
>>>>>>>>>> Telles Mota Vidal Nobrega
>>>>>>>>>> M.sc. Candidate at UFCG
>>>>>>>>>> B.sc. in Computer Science at UFCG
>>>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> ------------------------------------------
>>>>>>>> Telles Mota Vidal Nobrega
>>>>>>>> M.sc. Candidate at UFCG
>>>>>>>> B.sc. in Computer Science at UFCG
>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> ------------------------------------------
>>>>> Telles Mota Vidal Nobrega
>>>>> M.sc. Candidate at UFCG
>>>>> B.sc. in Computer Science at UFCG
>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>>> 
>>> 
>> 
>> 


Re: Running Job on Multinode Yarn Cluster

Posted by Yan Fang <ya...@gmail.com>.
Cool, we are almost there. Could you remove

<property>
  <name>fs.hdfs.impl</name>
  <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
  <description>The FileSystem for hdfs: uris.</description>
</property>

To see how it works?
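For context, the stack traces earlier in the thread show Hadoop discovering DistributedFileSystem through java.util.ServiceLoader, so once the matching hadoop-hdfs jar is on the classpath the explicit fs.hdfs.impl mapping is redundant. A minimal core-site.xml after dropping that block might keep only the default filesystem entry; the host name and port below are placeholders:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://my-namenode:8020</value>
  </property>
</configuration>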


Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108


On Mon, Aug 11, 2014 at 5:03 PM, Telles Nobrega <te...@gmail.com>
wrote:

> You may forget this last email, I was really stupid and put the files in a
> different folder. Now it could find the file but it’s not there yet…
> another error came up
>
> Exception in thread "main" java.util.ServiceConfigurationError:
> org.apache.hadoop.fs.FileSystem: Provider
> org.apache.hadoop.hdfs.DistributedFileSystem could not be instantiated
>         at java.util.ServiceLoader.fail(ServiceLoader.java:224)
>         at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
>         at
> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
>         at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
>         at
> org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2400)
>         at
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
>         at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>         at
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>         at
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>         at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>         at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>         at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>         at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>         at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> Caused by: java.lang.NoClassDefFoundError:
> org/apache/hadoop/conf/Configuration$DeprecationDelta
>         at
> org.apache.hadoop.hdfs.HdfsConfiguration.addDeprecatedKeys(HdfsConfiguration.java:66)
>         at
> org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:31)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:106)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>         at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at java.lang.Class.newInstance(Class.java:374)
>         at
> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
>         ... 15 more
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.conf.Configuration$DeprecationDelta
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>         ... 24 more
>
> On 11 Aug 2014, at 20:45, Telles Nobrega <te...@gmail.com> wrote:
>
> > Hi, I copied hadoop-hdfs-2.3.0 to my-job/lib and it changed the error
> which is good but the error is back to
> >
> > Exception in thread "main" java.lang.RuntimeException:
> java.lang.ClassNotFoundException: Class
> org.apache.hadoop.hdfs.DistributedFileSystem not found
> >       at
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1720)
> >       at
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2415)
> >       at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> >       at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >       at
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> >       at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >       at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> >       at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >       at
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> >       at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >       at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >       at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >       at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> >       at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> > Caused by: java.lang.ClassNotFoundException: Class
> org.apache.hadoop.hdfs.DistributedFileSystem not found
> >       at
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
> >       at
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1718)
> >       ... 13 more
> >
> > Do I need to have this lib in all nodes at the job folder or just to
> submit?
> >
> > On 11 Aug 2014, at 20:11, Yan Fang <ya...@gmail.com> wrote:
> >
> >> Hi Telles,
> >>
> >> I reproduced your problem and think I figured out why CLASSPATH does not
> >> work: in our script bin/run-class.sh, we have the line
> >> "CLASSPATH=$HADOOP_CONF_DIR", which actually ignores your setting.
> >>
> >> So a simple solution is to copy the hadoop-hdfs.jar to your samza lib
> >> directory. Then run bin/run-job --config-factory=...
> --config-path=... .
> >> Let me know how it goes. Thank you.
> >>
> >> Cheers,
> >>
> >> Fang, Yan
> >> yanfang724@gmail.com
> >> +1 (206) 849-4108
> >>
> >>
> >> On Mon, Aug 11, 2014 at 4:07 PM, Telles Nobrega <
> tellesnobrega@gmail.com>
> >> wrote:
> >>
> >>> Sure, thanks.
> >>>
> >>>
> >>> On Mon, Aug 11, 2014 at 6:22 PM, Yan Fang <ya...@gmail.com>
> wrote:
> >>>
> >>>> Hi Telles,
> >>>>
> >>>> I am not sure whether exporting the CLASSPATH works. (sometimes it
> does
> >>> not
> >>>> work for me...) My suggestion is to include the hdfs jar explicitly in
> >>> the
> >>>> package that you upload to hdfs. Also , remember to put the jar into
> your
> >>>> local samza (which is deploy/samza/lib if you go with the hello-samza
> >>>> tutorial). Let me know if that works.
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Fang, Yan
> >>>> yanfang724@gmail.com
> >>>> +1 (206) 849-4108
> >>>>
> >>>>
> >>>> On Mon, Aug 11, 2014 at 2:04 PM, Chris Riccomini <
> >>>> criccomini@linkedin.com.invalid> wrote:
> >>>>
> >>>>> Hey Telles,
> >>>>>
> >>>>> Hmm. I'm out of ideas. If Zhijie is around, he'd probably be of use,
> >>> but
> >>>> I
> >>>>> haven't heard from him in a while.
> >>>>>
> >>>>> I'm afraid your best bet is probably to email the YARN dev mailing
> >>> list,
> >>>>> since this is a YARN config issue.
> >>>>>
> >>>>> Cheers,
> >>>>> Chris
> >>>>>
> >>>>> On 8/11/14 1:58 PM, "Telles Nobrega" <te...@gmail.com>
> wrote:
> >>>>>
> >>>>>> ​I exported ​export
> >>>>>
> >>>>
> >>>>
> CLASSPATH=$CLASSPATH:hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar
> >>>>>> and the same problem still happened.
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Aug 11, 2014 at 5:35 PM, Chris Riccomini <
> >>>>>> criccomini@linkedin.com.invalid> wrote:
> >>>>>>
> >>>>>>> Hey Telles,
> >>>>>>>
> >>>>>>> It sounds like either the HDFS jar is missing from the classpath,
> or
> >>>> the
> >>>>>>> hdfs file system needs to be configured:
> >>>>>>>
> >>>>>>> <property>
> >>>>>>> <name>fs.hdfs.impl</name>
> >>>>>>> <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
> >>>>>>> <description>The FileSystem for hdfs: uris.</description>
> >>>>>>> </property>
> >>>>>>>
> >>>>>>>
> >>>>>>> (from
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>
> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptA
> >>>>>>> zE
> >>>>>>> 0)
> >>>>>>>
> >>>>>>> I believe this will need to be configured for your NM.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Chris
> >>>>>>>
> >>>>>>> On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com>
> >>> wrote:
> >>>>>>>
> >>>>>>>> Yes, it is like this:
> >>>>>>>>
> >>>>>>>> <configuration>
> >>>>>>>> <property>
> >>>>>>>>  <name>dfs.datanode.data.dir</name>
> >>>>>>>>  <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
> >>>>>>>>  <description>Comma separated list of paths on the local
> >>>> filesystem
> >>>>>>> of
> >>>>>>>> a
> >>>>>>>> DataNode where it should store its blocks.</description>
> >>>>>>>> </property>
> >>>>>>>>
> >>>>>>>> <property>
> >>>>>>>>  <name>dfs.namenode.name.dir</name>
> >>>>>>>>  <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
> >>>>>>>>  <description>Path on the local filesystem where the NameNode
> >>>> stores
> >>>>>>>> the
> >>>>>>>> namespace and transaction logs persistently.</description>
> >>>>>>>> </property>
> >>>>>>>> </configuration>
> >>>>>>>> ~
> >>>>>>>>
> >>>>>>>> I saw some report that this may be a classpath problem. Does this
> >>>>>>> sound
> >>>>>>>> right to you?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com>
> >>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Telles,
> >>>>>>>>>
> >>>>>>>>> It looks correct. Did you put the hdfs-site.xml into your
> >>>>>>>>> HADOOP_CONF_DIR
> >>>>>>>>> ?(such as ~/.samza/conf)
> >>>>>>>>>
> >>>>>>>>> Fang, Yan
> >>>>>>>>> yanfang724@gmail.com
> >>>>>>>>> +1 (206) 849-4108
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
> >>>>>>>>> <te...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> ​Hi Yan Fang,
> >>>>>>>>>>
> >>>>>>>>>> I was able to deploy the file to hdfs, I can see them in all my
> >>>>>>> nodes
> >>>>>>>>> but
> >>>>>>>>>> when I tried running I got this error:
> >>>>>>>>>>
> >>>>>>>>>> Exception in thread "main" java.io.IOException: No FileSystem
> >>> for
> >>>>>>>>> scheme:
> >>>>>>>>>> hdfs
> >>>>>>>>>> at
> >>>>>>>>>
> >>>>>
> >>>>
> >>>>>
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
> >>>>>>>>>> at
> >>>>>>>>>
> >>>>>
> >>>>>
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> >>>>>>>>>> at
> >>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >>>>>>>>>> at
> >>>>>>>>>
> >>>>>
> >>>>>
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> >>>>>>>>>> at
> >>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >>>>>>>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> >>>>>>>>>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >>>>>>>>>> at
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.s
> >>>>>>>>> ca
> >>>>>>>>> la:111)
> >>>>>>>>>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >>>>>>>>>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >>>>>>>>>> at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >>>>>>>>>> at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> >>>>>>>>>> at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> This is my yarn.package.path config:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>
> ​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0
> >>>>>>>>> .7
> >>>>>>>>> .0-dist.tar.gz
> >>>>>>>>>>
> >>>>>>>>>> Thanks in advance
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <
> >>> yanfang724@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Telles,
> >>>>>>>>>>>
> >>>>>>>>>>> In terms of "*I tried pushing the tar file to HDFS but I got
> >>> an
> >>>>>>>>> error
> >>>>>>>>>> from
> >>>>>>>>>>> hadoop saying that it couldn’t find core-site.xml file*.", I
> >>>>>>> guess
> >>>>>>>>> you
> >>>>>>>>>> set
> >>>>>>>>>>> the HADOOP_CONF_DIR variable and made it point to
> >>>> ~/.samza/conf.
> >>>>>>> You
> >>>>>>>>> can
> >>>>>>>>>> do
> >>>>>>>>>>> 1) make the HADOOP_CONF_DIR point to the directory where your
> >>>>>>> conf
> >>>>>>>>> files
> >>>>>>>>>>> are, such as /etc/hadoop/conf. Or 2) copy the config files to
> >>>>>>>>>>> ~/.samza/conf. Thank you,
> >>>>>>>>>>>
> >>>>>>>>>>> Cheer,
> >>>>>>>>>>>
> >>>>>>>>>>> Fang, Yan
> >>>>>>>>>>> yanfang724@gmail.com
> >>>>>>>>>>> +1 (206) 849-4108
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> >>>>>>>>>>> criccomini@linkedin.com.invalid> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hey Telles,
> >>>>>>>>>>>>
> >>>>>>>>>>>> To get YARN working with the HTTP file system, you need to
> >>>>>>> follow
> >>>>>>>>> the
> >>>>>>>>>>>> instructions on:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
> >>>>>>>>> -y
> >>>>>>>>>>>> arn.html
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> In the "Set Up Http Filesystem for YARN" section.
> >>>>>>>>>>>>
> >>>>>>>>>>>> You shouldn't need to compile anything (no Gradle, which is
> >>>>>>> what
> >>>>>>>>> your
> >>>>>>>>>>>> stack trace is showing). This setup should be done for all
> >>> of
> >>>>>>> the
> >>>>>>>>> NMs,
> >>>>>>>>>>>> since they will be the ones downloading your job's package
> >>>>>>> (from
> >>>>>>>>>>>> yarn.package.path).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>> Chris
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 8/9/14 9:44 PM, "Telles Nobrega" <
> >>> tellesnobrega@gmail.com
> >>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi again, I tried installing the scala libs but the Http
> >>>>>>> problem
> >>>>>>>>> still
> >>>>>>>>>>>>> occurs. I realised that I need to compile incubator samza
> >>> in
> >>>>>>> the
> >>>>>>>>>>> machines
> >>>>>>>>>>>>> that I'm going to run the jobs, but the compilation fails
> >>>> with
> >>>>>>>>> this
> >>>>>>>>>> huge
> >>>>>>>>>>>>> message:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> #
> >>>>>>>>>>>>> # There is insufficient memory for the Java Runtime
> >>>>>>> Environment
> >>>>>>>>> to
> >>>>>>>>>>>>> continue.
> >>>>>>>>>>>>> # Native memory allocation (malloc) failed to allocate
> >>>>>>> 3946053632
> >>>>>>>>>> bytes
> >>>>>>>>>>>>> for committing reserved memory.
> >>>>>>>>>>>>> # An error report file with more information is saved as:
> >>>>>>>>>>>>> #
> >>>> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> >>>>>>>>>>>>> Could not write standard input into: Gradle Worker 13.
> >>>>>>>>>>>>> java.io.IOException: Broken pipe
> >>>>>>>>>>>>>     at java.io.FileOutputStream.writeBytes(Native
> >>> Method)
> >>>>>>>>>>>>>     at
> >>>>>>>>> java.io.FileOutputStream.write(FileOutputStream.java:345)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>>>
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
> >>>>>>>>>> pu
> >>>>>>>>>> tH
> >>>>>>>>>>>>> andleRunner.java:53)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> >>>>>>>>>> rI
> >>>>>>>>>> mp
> >>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> >>>>>>>>>> av
> >>>>>>>>>> a:
> >>>>>>>>>>>>> 1145)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> >>>>>>>>>> ja
> >>>>>>>>>> va
> >>>>>>>>>>>>> :615)
> >>>>>>>>>>>>>     at java.lang.Thread.run(Thread.java:744)
> >>>>>>>>>>>>> Process 'Gradle Worker 13' finished with non-zero exit
> >>>> value 1
> >>>>>>>>>>>>> org.gradle.process.internal.ExecException: Process 'Gradle
> >>>>>>> Worker
> >>>>>>>>> 13'
> >>>>>>>>>>>>> finished with non-zero exit value 1
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
> >>>>>>>>>> ma
> >>>>>>>>>> lE
> >>>>>>>>>>>>> xitValue(DefaultExecHandle.java:362)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
> >>>>>>>>>> Wo
> >>>>>>>>>> rk
> >>>>>>>>>>>>> erProcess.java:89)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
> >>>>>>>>>> ke
> >>>>>>>>>> rP
> >>>>>>>>>>>>> rocess.java:33)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
> >>>>>>>>>> ef
> >>>>>>>>>> au
> >>>>>>>>>>>>> ltWorkerProcess.java:55)
> >>>>>>>>>>>>>     at
> >>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>>>>>>> Method)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> >>>>>>>>>> av
> >>>>>>>>>> a:
> >>>>>>>>>>>>> 57)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> >>>>>>>>>> or
> >>>>>>>>>> Im
> >>>>>>>>>>>>> pl.java:43)
> >>>>>>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> >>>>>>>>>> sp
> >>>>>>>>>> at
> >>>>>>>>>>>>> ch.java:35)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> >>>>>>>>>> sp
> >>>>>>>>>> at
> >>>>>>>>>>>>> ch.java:24)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> >>>>>>>>>> 81
> >>>>>>>>>> )
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> >>>>>>>>>> 30
> >>>>>>>>>> )
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
> >>>>>>>>>> on
> >>>>>>>>>> Ha
> >>>>>>>>>>>>> ndler.invoke(ProxyDispatchAdapter.java:93)
> >>>>>>>>>>>>>     at com.sun.proxy.$Proxy46.executionFinished(Unknown
> >>>>>>>>> Source)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
> >>>>>>>>>> xe
> >>>>>>>>>> cH
> >>>>>>>>>>>>> andle.java:212)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
> >>>>>>>>>> le
> >>>>>>>>>> .j
> >>>>>>>>>>>>> ava:309)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
> >>>>>>>>>> r.
> >>>>>>>>>> ja
> >>>>>>>>>>>>> va:108)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
> >>>>>>>>>> :8
> >>>>>>>>>> 8)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> >>>>>>>>>> rI
> >>>>>>>>>> mp
> >>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> >>>>>>>>>> av
> >>>>>>>>>> a:
> >>>>>>>>>>>>> 1145)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> >>>>>>>>>> ja
> >>>>>>>>>> va
> >>>>>>>>>>>>> :615)
> >>>>>>>>>>>>>     at java.lang.Thread.run(Thread.java:744)
> >>>>>>>>>>>>> OpenJDK 64-Bit Server VM warning: INFO:
> >>>>>>>>>>>>> os::commit_memory(0x000000070a6c0000, 3946053632, 0)
> >>> failed;
> >>>>>>>>>>>>> error='Cannot allocate memory' (errno=12)
> >>>>>>>>>>>>> #
> >>>>>>>>>>>>> # There is insufficient memory for the Java Runtime
> >>>>>>> Environment
> >>>>>>>>> to
> >>>>>>>>>>>>> continue.
> >>>>>>>>>>>>> # Native memory allocation (malloc) failed to allocate
> >>>>>>> 3946053632
> >>>>>>>>>> bytes
> >>>>>>>>>>>>> for committing reserved memory.
> >>>>>>>>>>>>> # An error report file with more information is saved as:
> >>>>>>>>>>>>> #
> >>>> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> >>>>>>>>>>>>> Could not write standard input into: Gradle Worker 14.
> >>>>>>>>>>>>> java.io.IOException: Broken pipe
> >>>>>>>>>>>>>     at java.io.FileOutputStream.writeBytes(Native
> >>> Method)
> >>>>>>>>>>>>>     at
> >>>>>>>>> java.io.FileOutputStream.write(FileOutputStream.java:345)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>>>
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
> >>>>>>>>>> pu
> >>>>>>>>>> tH
> >>>>>>>>>>>>> andleRunner.java:53)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> >>>>>>>>>> rI
> >>>>>>>>>> mp
> >>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> >>>>>>>>>> av
> >>>>>>>>>> a:
> >>>>>>>>>>>>> 1145)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> >>>>>>>>>> ja
> >>>>>>>>>> va
> >>>>>>>>>>>>> :615)
> >>>>>>>>>>>>>     at java.lang.Thread.run(Thread.java:744)
> >>>>>>>>>>>>> Process 'Gradle Worker 14' finished with non-zero exit
> >>>> value 1
> >>>>>>>>>>>>> org.gradle.process.internal.ExecException: Process 'Gradle
> >>>>>>> Worker
> >>>>>>>>> 14'
> >>>>>>>>>>>>> finished with non-zero exit value 1
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
> >>>>>>>>>> ma
> >>>>>>>>>> lE
> >>>>>>>>>>>>> xitValue(DefaultExecHandle.java:362)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
> >>>>>>>>>> Wo
> >>>>>>>>>> rk
> >>>>>>>>>>>>> erProcess.java:89)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
> >>>>>>>>>> ke
> >>>>>>>>>> rP
> >>>>>>>>>>>>> rocess.java:33)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
> >>>>>>>>>> ef
> >>>>>>>>>> au
> >>>>>>>>>>>>> ltWorkerProcess.java:55)
> >>>>>>>>>>>>>     at
> >>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> >>>>>>>>> Method)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> >>>>>>>>>> av
> >>>>>>>>>> a:
> >>>>>>>>>>>>> 57)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> >>>>>>>>>> or
> >>>>>>>>>> Im
> >>>>>>>>>>>>> pl.java:43)
> >>>>>>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> >>>>>>>>>> sp
> >>>>>>>>>> at
> >>>>>>>>>>>>> ch.java:35)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> >>>>>>>>>> sp
> >>>>>>>>>> at
> >>>>>>>>>>>>> ch.java:24)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> >>>>>>>>>> 81
> >>>>>>>>>> )
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> >>>>>>>>>> 30
> >>>>>>>>>> )
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
> >>>>>>>>>> on
> >>>>>>>>>> Ha
> >>>>>>>>>>>>> ndler.invoke(ProxyDispatchAdapter.java:93)
> >>>>>>>>>>>>>     at com.sun.proxy.$Proxy46.executionFinished(Unknown
> >>>>>>>>> Source)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
> >>>>>>>>>> xe
> >>>>>>>>>> cH
> >>>>>>>>>>>>> andle.java:212)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
> >>>>>>>>>> le
> >>>>>>>>>> .j
> >>>>>>>>>>>>> ava:309)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
> >>>>>>>>>> r.
> >>>>>>>>>> ja
> >>>>>>>>>>>>> va:108)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
> >>>>>>>>>> :8
> >>>>>>>>>> 8)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> >>>>>>>>>> rI
> >>>>>>>>>> mp
> >>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> >>>>>>>>>> av
> >>>>>>>>>> a:
> >>>>>>>>>>>>> 1145)
> >>>>>>>>>>>>>     at
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> >>>>>>>>>> ja
> >>>>>>>>>> va
> >>>>>>>>>>>>> :615)
> >>>>>>>>>>>>>     at java.lang.Thread.r
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Do I need more memory for my machines? Each already has
> >>>> 4GB. I
> >>>>>>>>> really
> >>>>>>>>>>>>> need to have this running. I'm not sure which way is best
> >>>>>>> http or
> >>>>>>>>> hdfs
> >>>>>>>>>>>>> which one you suggest and how can i solve my problem for
> >>>> each
> >>>>>>>>> case.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks in advance and sorry for bothering this much.
> >>>>>>>>>>>>> On 10 Aug 2014, at 00:20, Telles Nobrega
> >>>>>>>>> <te...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Chris, now I have the tar file in my RM machine, and
> >>>> the
> >>>>>>>>> yarn
> >>>>>>>>>> path
> >>>>>>>>>>>>>> points to it. I changed the core-site.xml to use
> >>>>>>> HttpFileSystem
> >>>>>>>>>> instead
> >>>>>>>>>>>>>> of HDFS now it is failing with
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Application application_1407640485281_0001 failed 2
> >>> times
> >>>>>>> due
> >>>>>>>>> to
> >>>>>>>>> AM
> >>>>>>>>>>>>>> Container for appattempt_1407640485281_0001_000002 exited
> >>>>>>> with
> >>>>>>>>>>>>>> exitCode:-1000 due to: java.lang.ClassNotFoundException:
> >>>>>>> Class
> >>>>>>>>>>>>>> org.apache.samza.util.hadoop.HttpFileSystem not found
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I think I can solve this just installing scala files
> >>> from
> >>>>>>> the
> >>>>>>>>> samza
> >>>>>>>>>>>>>> tutorial, can you confirm that?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 09 Aug 2014, at 08:34, Telles Nobrega
> >>>>>>>>> <tellesnobrega@gmail.com
> >>>>>>>>>>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Chris,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think the problem is that I forgot to update the
> >>>>>>>>>> yarn.job.package.
> >>>>>>>>>>>>>>> I will try again to see if it works now.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I have one more question, how can I stop (command line)
> >>>> the
> >>>>>>>>> jobs
> >>>>>>>>>>>>>>> running in my topology, for the experiment that I will
> >>>> run,
> >>>>>>> I
> >>>>>>>>> need
> >>>>>>>>>> to
> >>>>>>>>>>>>>>> run the same job in 4 minutes intervals. So I need to
> >>> kill
> >>>>>>> it,
> >>>>>>>>> clean
> >>>>>>>>>>>>>>> the kafka topics and rerun.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks in advance.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 08 Aug 2014, at 12:41, Chris Riccomini
> >>>>>>>>>>>>>>> <cr...@linkedin.com.INVALID> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hey Telles,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Do I need to have the job folder on each machine in
> >>> my
> >>>>>>>>> cluster?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> No, you should not need to do this. There are two ways
> >>>> to
> >>>>>>>>> deploy
> >>>>>>>>>>> your
> >>>>>>>>>>>>>>>> tarball to the YARN grid. One is to put it in HDFS,
> >>> and
> >>>>>>> the
> >>>>>>>>> other
> >>>>>>>>>> is
> >>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>> put it on an HTTP server. The link to running a Samza
> >>>> job
> >>>>>>> in
> >>>>>>>>> a
> >>>>>>>>>>>>>>>> multi-node
> >>>>>>>>>>>>>>>> YARN cluster describes how to do both (either HTTP
> >>>> server
> >>>>>>> or
> >>>>>>>>>> HDFS).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> In both cases, once the tarball is put in on the
> >>>> HTTP/HDFS
> >>>>>>>>>>> server(s),
> >>>>>>>>>>>>>>>> you
> >>>>>>>>>>>>>>>> must update yarn.package.path to point to it. From
> >>>> there,
> >>>>>>> the
> >>>>>>>>> YARN
> >>>>>>>>>>> NM
> >>>>>>>>>>>>>>>> should download it for you automatically when you
> >>> start
> >>>>>>> your
> >>>>>>>>> job.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> * Can you send along a paste of your job config?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>> Chris
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 8/8/14 8:04 AM, "Claudio Martins"
> >>>>>>>>> <cl...@mobileaware.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi Telles, it looks to me that you forgot to update
> >>> the
> >>>>>>>>>>>>>>>>> "yarn.package.path"
> >>>>>>>>>>>>>>>>> attribute in your config file for the task.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> - Claudio Martins
> >>>>>>>>>>>>>>>>> Head of Engineering
> >>>>>>>>>>>>>>>>> MobileAware USA Inc. / www.mobileaware.com
> >>>>>>>>>>>>>>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> >>>>>>>>>>>>>>>>> linkedin: www.linkedin.com/in/martinsclaudio
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> >>>>>>>>>>>>>>>>> <te...@gmail.com>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> this is my first time trying to run a job on a
> >>>> multinode
> >>>>>>>>>>>>>>>>>> environment. I
> >>>>>>>>>>>>>>>>>> have the cluster set up, I can see in the GUI that
> >>> all
> >>>>>>>>> nodes
> >>>>>>>>> are
> >>>>>>>>>>>>>>>>>> working.
> >>>>>>>>>>>>>>>>>> Do I need to have the job folder on each machine in
> >>> my
> >>>>>>>>> cluster?
> >>>>>>>>>>>>>>>>>> - The first time I tried running with the job on the
> >>>>>>>>> namenode
> >>>>>>>>>>>>>>>>>> machine
> >>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>> it failed saying:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Application application_1407509228798_0001 failed 2
> >>>>>>> times
> >>>>>>>>> due
> >>>>>>>>> to
> >>>>>>>>>>> AM
> >>>>>>>>>>>>>>>>>> Container for appattempt_1407509228798_0001_000002
> >>>>>>> exited
> >>>>>>>>> with
> >>>>>>>>>>>>>>>>>> exitCode:
> >>>>>>>>>>>>>>>>>> -1000 due to: File
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>>>>>>
> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> >>>>>>>>>>>>>>> pa
> >>>>>>>>>>>>>>> ck
> >>>>>>>>>>>>>>>>>> age-
> >>>>>>>>>>>>>>>>>> 0.7.0-dist.tar.gz
> >>>>>>>>>>>>>>>>>> does not exist
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> So I copied the folder to each machine in my cluster
> >>>> and
> >>>>>>>>> got
> >>>>>>>>>> this
> >>>>>>>>>>>>>>>>>> error:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Application application_1407509228798_0002 failed 2
> >>>>>>> times
> >>>>>>>>> due
> >>>>>>>>> to
> >>>>>>>>>>> AM
> >>>>>>>>>>>>>>>>>> Container for appattempt_1407509228798_0002_000002
> >>>>>>> exited
> >>>>>>>>> with
> >>>>>>>>>>>>>>>>>> exitCode:
> >>>>>>>>>>>>>>>>>> -1000 due to: Resource
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>>>>>>>>>>
> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> >>>>>>>>>>>>>>> pa
> >>>>>>>>>>>>>>> ck
> >>>>>>>>>>>>>>>>>> age-
> >>>>>>>>>>>>>>>>>> 0.7.0-dist.tar.gz
> >>>>>>>>>>>>>>>>>> changed on src filesystem (expected 1407509168000,
> >>> was
> >>>>>>>>>>> 1407509434000
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> What am I missing?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> p.s.: I followed this
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> <
> >>>>>>>>>>>>
> >>>>>>>>>
> >>>> https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> >>>>>>>>>>>>>>>>>> a>
> >>>>>>>>>>>>>>>>>> tutorial
> >>>>>>>>>>>>>>>>>> and this
> >>>>>>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> >>>>>>>>>>>>>>>>>> node
> >>>>>>>>>>>>>>>>>> -yarn.html
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>> set up the cluster.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Help is much appreciated.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks in advance.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>> ------------------------------------------
> >>>>>>>>>>>>>>>>>> Telles Mota Vidal Nobrega
> >>>>>>>>>>>>>>>>>> M.sc. Candidate at UFCG
> >>>>>>>>>>>>>>>>>> B.sc. in Computer Science at UFCG
> >>>>>>>>>>>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> ------------------------------------------
> >>>>>>>>>> Telles Mota Vidal Nobrega
> >>>>>>>>>> M.sc. Candidate at UFCG
> >>>>>>>>>> B.sc. in Computer Science at UFCG
> >>>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> ------------------------------------------
> >>>>>>>> Telles Mota Vidal Nobrega
> >>>>>>>> M.sc. Candidate at UFCG
> >>>>>>>> B.sc. in Computer Science at UFCG
> >>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> ------------------------------------------
> >>>>>> Telles Mota Vidal Nobrega
> >>>>>> M.sc. Candidate at UFCG
> >>>>>> B.sc. in Computer Science at UFCG
> >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> ------------------------------------------
> >>> Telles Mota Vidal Nobrega
> >>> M.sc. Candidate at UFCG
> >>> B.sc. in Computer Science at UFCG
> >>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>
> >
>
>

Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Please disregard my last email; I was careless and had put the files in a different folder. Now it can find the file, but it's still not working. Another error came up:

Exception in thread "main" java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.DistributedFileSystem could not be instantiated
	at java.util.ServiceLoader.fail(ServiceLoader.java:224)
	at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
	at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
	at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
	at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2400)
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2411)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
	at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
	at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
	at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
	at org.apache.samza.job.JobRunner.main(JobRunner.scala)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration$DeprecationDelta
	at org.apache.hadoop.hdfs.HdfsConfiguration.addDeprecatedKeys(HdfsConfiguration.java:66)
	at org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:31)
	at org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:106)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at java.lang.Class.newInstance(Class.java:374)
	at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
	... 15 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration$DeprecationDelta
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 24 more
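
If I read it right, Configuration$DeprecationDelta only exists in hadoop-common 2.3.0 and later, so my guess is that the hadoop-hdfs-2.3.0 jar is being loaded next to an older hadoop-common jar (this is only a guess on my part). A quick way to check for mixed Hadoop versions, with paths that are just an illustration of my layout (assuming the hello-samza style deploy directory):

  # list the Hadoop jars that end up on the job classpath (illustrative paths)
  ls my-job/lib/hadoop-*.jar
  ls deploy/samza/lib/hadoop-*.jar
  # hadoop-common and hadoop-hdfs should carry the same version, e.g. 2.3.0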

On 11 Aug 2014, at 20:45, Telles Nobrega <te...@gmail.com> wrote:

> Hi, I copied hadoop-hdfs-2.3.0 to my-job/lib and the error changed, which is good, but now I am back to:
> 
> Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributedFileSystem not found
> 	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1720)
> 	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2415)
> 	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> 	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> 	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> 	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> 	at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> 	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> 	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> 	at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> 	at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> 	at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributedFileSystem not found
> 	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
> 	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1718)
> 	... 13 more
> 
> Do I need to have this lib in the job folder on all nodes, or just on the node I submit from?
> 
> On 11 Aug 2014, at 20:11, Yan Fang <ya...@gmail.com> wrote:
> 
>> Hi Telles,
>> 
>> I reproduced your problem and think I figured out why CLASSPATH does not
>> work: in our script bin/run-class.sh, we have the line
>> "CLASSPATH=$HADOOP_CONF_DIR", which actually ignores your setting.
>> 
>> So a simple solution is to copy the hadoop-hdfs.jar to your samza lib
>> directory. Then run bin/run-job --config-factory=... --config-path=... .
>> Let me know how it goes. Thank you.
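>> For example, something along these lines (the factory class and paths are
>> only placeholders for your own setup, not exact values):
>>
>>   cp hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar deploy/samza/lib/
>>   deploy/samza/bin/run-job.sh \
>>     --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory \
>>     --config-path=file://$PWD/deploy/samza/config/your-job.properties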
>> 
>> Cheers,
>> 
>> Fang, Yan
>> yanfang724@gmail.com
>> +1 (206) 849-4108
>> 
>> 
>> On Mon, Aug 11, 2014 at 4:07 PM, Telles Nobrega <te...@gmail.com>
>> wrote:
>> 
>>> Sure, thanks.
>>> 
>>> 
>>> On Mon, Aug 11, 2014 at 6:22 PM, Yan Fang <ya...@gmail.com> wrote:
>>> 
>>>> Hi Telles,
>>>> 
>>>> I am not sure whether exporting the CLASSPATH works. (sometimes it does
>>> not
>>>> work for me...) My suggestion is to include the hdfs jar explicitly in
>>> the
>>>> package that you upload to hdfs. Also , remember to put the jar into your
>>>> local samza (which is deploy/samza/lib if you go with the hello-samza
>>>> tutorial) Let me know if that works.
>>>> 
>>>> Cheers,
>>>> 
>>>> Fang, Yan
>>>> yanfang724@gmail.com
>>>> +1 (206) 849-4108
>>>> 
>>>> 
>>>> On Mon, Aug 11, 2014 at 2:04 PM, Chris Riccomini <
>>>> criccomini@linkedin.com.invalid> wrote:
>>>> 
>>>>> Hey Telles,
>>>>> 
>>>>> Hmm. I'm out of ideas. If Zhijie is around, he'd probably be of use,
>>> but
>>>> I
>>>>> haven't heard from him in a while.
>>>>> 
>>>>> I'm afraid your best bet is probably to email the YARN dev mailing
>>> list,
>>>>> since this is a YARN config issue.
>>>>> 
>>>>> Cheers,
>>>>> Chris
>>>>> 
>>>>> On 8/11/14 1:58 PM, "Telles Nobrega" <te...@gmail.com> wrote:
>>>>> 
>>>>>> ​I exported ​export
>>>>> 
>>>> 
>>>> CLASSPATH=$CLASSPATH:hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar
>>>>>> and still happened the same problem.
>>>>>> 
>>>>>> 
>>>>>> On Mon, Aug 11, 2014 at 5:35 PM, Chris Riccomini <
>>>>>> criccomini@linkedin.com.invalid> wrote:
>>>>>> 
>>>>>>> Hey Telles,
>>>>>>> 
>>>>>>> It sounds like either the HDFS jar is missing from the classpath, or
>>>> the
>>>>>>> hdfs file system needs to be configured:
>>>>>>> 
>>>>>>> <property>
>>>>>>> <name>fs.hdfs.impl</name>
>>>>>>> <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
>>>>>>> <description>The FileSystem for hdfs: uris.</description>
>>>>>>> </property>
>>>>>>> 
>>>>>>> 
>>>>>>> (from
>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptA
>>>>>>> zE
>>>>>>> 0)
>>>>>>> 
>>>>>>> I believe this will need to be configured for your NM.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Chris
>>>>>>> 
>>>>>>> On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com>
>>> wrote:
>>>>>>> 
>>>>>>>> Yes, it is like this:
>>>>>>>> 
>>>>>>>> <configuration>
>>>>>>>> <property>
>>>>>>>>  <name>dfs.datanode.data.dir</name>
>>>>>>>>  <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
>>>>>>>>  <description>Comma separated list of paths on the local
>>>> filesystem
>>>>>>> of
>>>>>>>> a
>>>>>>>> DataNode where it should store its blocks.</description>
>>>>>>>> </property>
>>>>>>>> 
>>>>>>>> <property>
>>>>>>>>  <name>dfs.namenode.name.dir</name>
>>>>>>>>  <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
>>>>>>>>  <description>Path on the local filesystem where the NameNode
>>>> stores
>>>>>>>> the
>>>>>>>> namespace and transaction logs persistently.</description>
>>>>>>>> </property>
>>>>>>>> </configuration>
>>>>>>>> ~
>>>>>>>> 
>>>>>>>> I saw some report that this may be a classpath problem. Does this
>>>>>>> sound
>>>>>>>> right to you?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com>
>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Telles,
>>>>>>>>> 
>>>>>>>>> It looks correct. Did you put the hdfs-site.xml into your
>>>>>>>>> HADOOP_CONF_DIR
>>>>>>>>> ?(such as ~/.samza/conf)
>>>>>>>>> 
>>>>>>>>> Fang, Yan
>>>>>>>>> yanfang724@gmail.com
>>>>>>>>> +1 (206) 849-4108
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
>>>>>>>>> <te...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> ​Hi Yan Fang,
>>>>>>>>>> 
>>>>>>>>>> I was able to deploy the file to hdfs, I can see them in all my
>>>>>>> nodes
>>>>>>>>> but
>>>>>>>>>> when I tried running I got this error:
>>>>>>>>>> 
>>>>>>>>>> Exception in thread "main" java.io.IOException: No FileSystem
>>> for
>>>>>>>>> scheme:
>>>>>>>>>> hdfs
>>>>>>>>>> at
>>>>>>>>> 
>>>>> 
>>>> 
>>>>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
>>>>>>>>>> at
>>>>>>>>> 
>>>>> 
>>>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>>>>>>>>>> at
>>> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>>>>>>>>>> at
>>>>>>>>> 
>>>>> 
>>>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>>>>>>>>>> at
>>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>>>>>>>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>>>>>>>>>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>>>>>>>>>> at
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.s
>>>>>>>>> ca
>>>>>>>>> la:111)
>>>>>>>>>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>>>>>>>>> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>>>>>>>>> at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>>>>>>>>>> at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>>>>>>>>>> at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> This is my yarn.package.path config:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>> ​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0
>>>>>>>>> .7
>>>>>>>>> .0-dist.tar.gz
>>>>>>>>>> 
>>>>>>>>>> Thanks in advance
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <
>>> yanfang724@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi Telles,
>>>>>>>>>>> 
>>>>>>>>>>> In terms of "*I tried pushing the tar file to HDFS but I got
>>> an
>>>>>>>>> error
>>>>>>>>>> from
>>>>>>>>>>> hadoop saying that it couldn’t find core-site.xml file*.", I
>>>>>>> guess
>>>>>>>>> you
>>>>>>>>>> set
>>>>>>>>>>> the HADOOP_CONF_DIR variable and made it point to
>>>> ~/.samza/conf.
>>>>>>> You
>>>>>>>>> can
>>>>>>>>>> do
>>>>>>>>>>> 1) make the HADOOP_CONF_DIR point to the directory where your
>>>>>>> conf
>>>>>>>>> files
>>>>>>>>>>> are, such as /etc/hadoop/conf. Or 2) copy the config files to
>>>>>>>>>>> ~/.samza/conf. Thank you,
>>>>>>>>>>> 
>>>>>>>>>>> Cheer,
>>>>>>>>>>> 
>>>>>>>>>>> Fang, Yan
>>>>>>>>>>> yanfang724@gmail.com
>>>>>>>>>>> +1 (206) 849-4108
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
>>>>>>>>>>> criccomini@linkedin.com.invalid> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hey Telles,
>>>>>>>>>>>> 
>>>>>>>>>>>> To get YARN working with the HTTP file system, you need to
>>>>>>> follow
>>>>>>>>> the
>>>>>>>>>>>> instructions on:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
>>>>>>>>> -y
>>>>>>>>>>>> arn.html
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> In the "Set Up Http Filesystem for YARN" section.
>>>>>>>>>>>> 
>>>>>>>>>>>> You shouldn't need to compile anything (no Gradle, which is
>>>>>>> what
>>>>>>>>> your
>>>>>>>>>>>> stack trace is showing). This setup should be done for all
>>> of
>>>>>>> the
>>>>>>>>> NMs,
>>>>>>>>>>>> since they will be the ones downloading your job's package
>>>>>>> (from
>>>>>>>>>>>> yarn.package.path).
>>>>>>>>>>>> 
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Chris
>>>>>>>>>>>> 
>>>>>>>>>>>> On 8/9/14 9:44 PM, "Telles Nobrega" <
>>> tellesnobrega@gmail.com
>>>>> 
>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi again, I tried installing the scala libs but the Http
>>>>>>> problem
>>>>>>>>> still
>>>>>>>>>>>>> occurs. I realised that I need to compile incubator samza
>>> in
>>>>>>> the
>>>>>>>>>>> machines
>>>>>>>>>>>>> that I'm going to run the jobs, but the compilation fails
>>>> with
>>>>>>>>> this
>>>>>>>>>> huge
>>>>>>>>>>>>> message:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # There is insufficient memory for the Java Runtime
>>>>>>> Environment
>>>>>>>>> to
>>>>>>>>>>>>> continue.
>>>>>>>>>>>>> # Native memory allocation (malloc) failed to allocate
>>>>>>> 3946053632
>>>>>>>>>> bytes
>>>>>>>>>>>>> for committing reserved memory.
>>>>>>>>>>>>> # An error report file with more information is saved as:
>>>>>>>>>>>>> #
>>>> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
>>>>>>>>>>>>> Could not write standard input into: Gradle Worker 13.
>>>>>>>>>>>>> java.io.IOException: Broken pipe
>>>>>>>>>>>>>     at java.io.FileOutputStream.writeBytes(Native
>>> Method)
>>>>>>>>>>>>>     at
>>>>>>>>> java.io.FileOutputStream.write(FileOutputStream.java:345)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>> 
>>>>> 
>>>>> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
>>>>>>>>>> pu
>>>>>>>>>> tH
>>>>>>>>>>>>> andleRunner.java:53)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>>>>>> rI
>>>>>>>>>> mp
>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>>>>>> av
>>>>>>>>>> a:
>>>>>>>>>>>>> 1145)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>>>>>> ja
>>>>>>>>>> va
>>>>>>>>>>>>> :615)
>>>>>>>>>>>>>     at java.lang.Thread.run(Thread.java:744)
>>>>>>>>>>>>> Process 'Gradle Worker 13' finished with non-zero exit
>>>> value 1
>>>>>>>>>>>>> org.gradle.process.internal.ExecException: Process 'Gradle
>>>>>>> Worker
>>>>>>>>> 13'
>>>>>>>>>>>>> finished with non-zero exit value 1
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
>>>>>>>>>> ma
>>>>>>>>>> lE
>>>>>>>>>>>>> xitValue(DefaultExecHandle.java:362)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
>>>>>>>>>> Wo
>>>>>>>>>> rk
>>>>>>>>>>>>> erProcess.java:89)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
>>>>>>>>>> ke
>>>>>>>>>> rP
>>>>>>>>>>>>> rocess.java:33)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
>>>>>>>>>> ef
>>>>>>>>>> au
>>>>>>>>>>>>> ltWorkerProcess.java:55)
>>>>>>>>>>>>>     at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>>>>>>> Method)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
>>>>>>>>>> av
>>>>>>>>>> a:
>>>>>>>>>>>>> 57)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
>>>>>>>>>> or
>>>>>>>>>> Im
>>>>>>>>>>>>> pl.java:43)
>>>>>>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>>>>>> sp
>>>>>>>>>> at
>>>>>>>>>>>>> ch.java:35)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>>>>>> sp
>>>>>>>>>> at
>>>>>>>>>>>>> ch.java:24)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>>>>>> 81
>>>>>>>>>> )
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>>>>>> 30
>>>>>>>>>> )
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
>>>>>>>>>> on
>>>>>>>>>> Ha
>>>>>>>>>>>>> ndler.invoke(ProxyDispatchAdapter.java:93)
>>>>>>>>>>>>>     at com.sun.proxy.$Proxy46.executionFinished(Unknown
>>>>>>>>> Source)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
>>>>>>>>>> xe
>>>>>>>>>> cH
>>>>>>>>>>>>> andle.java:212)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
>>>>>>>>>> le
>>>>>>>>>> .j
>>>>>>>>>>>>> ava:309)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
>>>>>>>>>> r.
>>>>>>>>>> ja
>>>>>>>>>>>>> va:108)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
>>>>>>>>>> :8
>>>>>>>>>> 8)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>>>>>> rI
>>>>>>>>>> mp
>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>>>>>> av
>>>>>>>>>> a:
>>>>>>>>>>>>> 1145)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>>>>>> ja
>>>>>>>>>> va
>>>>>>>>>>>>> :615)
>>>>>>>>>>>>>     at java.lang.Thread.run(Thread.java:744)
>>>>>>>>>>>>> OpenJDK 64-Bit Server VM warning: INFO:
>>>>>>>>>>>>> os::commit_memory(0x000000070a6c0000, 3946053632, 0)
>>> failed;
>>>>>>>>>>>>> error='Cannot allocate memory' (errno=12)
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # There is insufficient memory for the Java Runtime
>>>>>>> Environment
>>>>>>>>> to
>>>>>>>>>>>>> continue.
>>>>>>>>>>>>> # Native memory allocation (malloc) failed to allocate
>>>>>>> 3946053632
>>>>>>>>>> bytes
>>>>>>>>>>>>> for committing reserved memory.
>>>>>>>>>>>>> # An error report file with more information is saved as:
>>>>>>>>>>>>> #
>>>> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
>>>>>>>>>>>>> Could not write standard input into: Gradle Worker 14.
>>>>>>>>>>>>> java.io.IOException: Broken pipe
>>>>>>>>>>>>>     at java.io.FileOutputStream.writeBytes(Native
>>> Method)
>>>>>>>>>>>>>     at
>>>>>>>>> java.io.FileOutputStream.write(FileOutputStream.java:345)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>> 
>>>>> 
>>>>> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
>>>>>>>>>> pu
>>>>>>>>>> tH
>>>>>>>>>>>>> andleRunner.java:53)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>>>>>> rI
>>>>>>>>>> mp
>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>>>>>> av
>>>>>>>>>> a:
>>>>>>>>>>>>> 1145)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>>>>>> ja
>>>>>>>>>> va
>>>>>>>>>>>>> :615)
>>>>>>>>>>>>>     at java.lang.Thread.run(Thread.java:744)
>>>>>>>>>>>>> Process 'Gradle Worker 14' finished with non-zero exit
>>>> value 1
>>>>>>>>>>>>> org.gradle.process.internal.ExecException: Process 'Gradle
>>>>>>> Worker
>>>>>>>>> 14'
>>>>>>>>>>>>> finished with non-zero exit value 1
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
>>>>>>>>>> ma
>>>>>>>>>> lE
>>>>>>>>>>>>> xitValue(DefaultExecHandle.java:362)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
>>>>>>>>>> Wo
>>>>>>>>>> rk
>>>>>>>>>>>>> erProcess.java:89)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
>>>>>>>>>> ke
>>>>>>>>>> rP
>>>>>>>>>>>>> rocess.java:33)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
>>>>>>>>>> ef
>>>>>>>>>> au
>>>>>>>>>>>>> ltWorkerProcess.java:55)
>>>>>>>>>>>>>     at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>>>>>>> Method)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
>>>>>>>>>> av
>>>>>>>>>> a:
>>>>>>>>>>>>> 57)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
>>>>>>>>>> or
>>>>>>>>>> Im
>>>>>>>>>>>>> pl.java:43)
>>>>>>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>>>>>> sp
>>>>>>>>>> at
>>>>>>>>>>>>> ch.java:35)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>>>>>> sp
>>>>>>>>>> at
>>>>>>>>>>>>> ch.java:24)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>>>>>> 81
>>>>>>>>>> )
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>>>>>> 30
>>>>>>>>>> )
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
>>>>>>>>>> on
>>>>>>>>>> Ha
>>>>>>>>>>>>> ndler.invoke(ProxyDispatchAdapter.java:93)
>>>>>>>>>>>>>     at com.sun.proxy.$Proxy46.executionFinished(Unknown
>>>>>>>>> Source)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
>>>>>>>>>> xe
>>>>>>>>>> cH
>>>>>>>>>>>>> andle.java:212)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
>>>>>>>>>> le
>>>>>>>>>> .j
>>>>>>>>>>>>> ava:309)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
>>>>>>>>>> r.
>>>>>>>>>> ja
>>>>>>>>>>>>> va:108)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
>>>>>>>>>> :8
>>>>>>>>>> 8)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>>>>>> rI
>>>>>>>>>> mp
>>>>>>>>>>>>> l$1.run(DefaultExecutorFactory.java:66)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>>>>>> av
>>>>>>>>>> a:
>>>>>>>>>>>>> 1145)
>>>>>>>>>>>>>     at
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>>>>>> ja
>>>>>>>>>> va
>>>>>>>>>>>>> :615)
>>>>>>>>>>>>>     at java.lang.Thread.r
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Do I need more memory for my machines? Each already has
>>>> 4GB. I
>>>>>>>>> really
>>>>>>>>>>>>> need to have this running. I'm not sure which way is best
>>>>>>> http or
>>>>>>>>> hdfs
>>>>>>>>>>>>> which one you suggest and how can i solve my problem for
>>>> each
>>>>>>>>> case.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks in advance and sorry for bothering this much.
>>>>>>>>>>>>> On 10 Aug 2014, at 00:20, Telles Nobrega
>>>>>>>>> <te...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Chris, now I have the tar file in my RM machine, and
>>>> the
>>>>>>>>> yarn
>>>>>>>>>> path
>>>>>>>>>>>>>> points to it. I changed the core-site.xml to use
>>>>>>> HttpFileSystem
>>>>>>>>>> instead
>>>>>>>>>>>>>> of HDFS now it is failing with
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Application application_1407640485281_0001 failed 2
>>> times
>>>>>>> due
>>>>>>>>> to
>>>>>>>>> AM
>>>>>>>>>>>>>> Container for appattempt_1407640485281_0001_000002 exited
>>>>>>> with
>>>>>>>>>>>>>> exitCode:-1000 due to: java.lang.ClassNotFoundException:
>>>>>>> Class
>>>>>>>>>>>>>> org.apache.samza.util.hadoop.HttpFileSystem not found
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I think I can solve this just installing scala files
>>> from
>>>>>>> the
>>>>>>>>> samza
>>>>>>>>>>>>>> tutorial, can you confirm that?


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Hi, I copied hadoop-hdfs-2.3.0 to my-job/lib and the error changed, which is good, but now it is back to:

Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributedFileSystem not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1720)
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2415)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
	at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
	at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
	at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
	at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
	at org.apache.samza.job.JobRunner.main(JobRunner.scala)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributedFileSystem not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1718)
	... 13 more

Do I need to have this lib in the job folder on all nodes, or only on the node that submits the job?
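
A quick way to check where the jar actually ends up (a sketch only — assuming the hello-samza style layout and the package path used earlier in this thread):

ls deploy/samza/lib | grep hadoop-hdfs
tar -tzf ~/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz | grep hadoop-hdfs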

On 11 Aug 2014, at 20:11, Yan Fang <ya...@gmail.com> wrote:

> Hi Telles,
> 
> I reproduced your problem and think I figured out why CLASSPATH does not
> work: in our script bin/run-class.sh, we have the line
> "CLASSPATH=$HADOOP_CONF_DIR", which actually ignores your setting.
> 
> So a simple solution is to copy the hadoop-hdfs.jar to your samza lib
> directory. Then run bin/run-job --config-factory=... --config-path=... .
> Let me know how it goes. Thank you.
> 
> Cheers,
> 
> Fang, Yan
> yanfang724@gmail.com
> +1 (206) 849-4108


Re: Running Job on Multinode Yarn Cluster

Posted by Yan Fang <ya...@gmail.com>.
Hi Telles,

I reproduced your problem and think I figured out why CLASSPATH does not
work: in our script bin/run-class.sh, we have the line
"CLASSPATH=$HADOOP_CONF_DIR", which actually ignores your setting.

So a simple solution is to copy the hadoop-hdfs.jar to your samza lib
directory. Then run bin/run-job --config-factory=... --config-path=... .
Let me know how it goes. Thank you.
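
Concretely, that amounts to something like the following (a sketch only — the jar location is the one mentioned earlier in this thread, and the config factory/path are placeholders to adapt to your own job):

cp hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar deploy/samza/lib/
deploy/samza/bin/run-job.sh \
  --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory \
  --config-path=file://$PWD/deploy/samza/config/your-job.properties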

Cheers,

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108


On Mon, Aug 11, 2014 at 4:07 PM, Telles Nobrega <te...@gmail.com>
wrote:

> Sure, thanks.
>
>
> On Mon, Aug 11, 2014 at 6:22 PM, Yan Fang <ya...@gmail.com> wrote:
>
> > Hi Telles,
> >
> > I am not sure whether exporting the CLASSPATH works. (sometimes it does
> not
> > work for me...) My suggestion is to include the hdfs jar explicitly in
> the
> > package that you upload to hdfs. Also , remember to put the jar into your
> > local samza (which is deploy/samza/lib if you go with the hello-samza
> > tutorial) Let me know if that works.
> >
> > Cheers,
> >
> > Fang, Yan
> > yanfang724@gmail.com
> > +1 (206) 849-4108
> >
> >
> > On Mon, Aug 11, 2014 at 2:04 PM, Chris Riccomini <
> > criccomini@linkedin.com.invalid> wrote:
> >
> > > Hey Telles,
> > >
> > > Hmm. I'm out of ideas. If Zhijie is around, he'd probably be of use,
> but
> > I
> > > haven't heard from him in a while.
> > >
> > > I'm afraid your best bet is probably to email the YARN dev mailing
> list,
> > > since this is a YARN config issue.
> > >
> > > Cheers,
> > > Chris
> > >
> > > On 8/11/14 1:58 PM, "Telles Nobrega" <te...@gmail.com> wrote:
> > >
> > > >​I exported ​export
> > >
> >
> >CLASSPATH=$CLASSPATH:hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar
> > > >and still happened the same problem.
> > > >
> > > >
> > > >On Mon, Aug 11, 2014 at 5:35 PM, Chris Riccomini <
> > > >criccomini@linkedin.com.invalid> wrote:
> > > >
> > > >> Hey Telles,
> > > >>
> > > >> It sounds like either the HDFS jar is missing from the classpath, or
> > the
> > > >> hdfs file system needs to be configured:
> > > >>
> > > >> <property>
> > > >>   <name>fs.hdfs.impl</name>
> > > >>   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
> > > >>   <description>The FileSystem for hdfs: uris.</description>
> > > >> </property>
> > > >>
> > > >>
> > > >> (from
> > > >>
> > > >>
> > >
> >
> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptA
> > > >>zE
> > > >> 0)
> > > >>
> > > >> I believe this will need to be configured for your NM.
> > > >>
> > > >> Cheers,
> > > >> Chris
> > > >>
> > > >> On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com>
> wrote:
> > > >>
> > > >> >Yes, it is like this:
> > > >> >
> > > >> ><configuration>
> > > >> >  <property>
> > > >> >    <name>dfs.datanode.data.dir</name>
> > > >> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
> > > >> >    <description>Comma separated list of paths on the local
> > filesystem
> > > >>of
> > > >> >a
> > > >> >DataNode where it should store its blocks.</description>
> > > >> >  </property>
> > > >> >
> > > >> >  <property>
> > > >> >    <name>dfs.namenode.name.dir</name>
> > > >> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
> > > >> >    <description>Path on the local filesystem where the NameNode
> > stores
> > > >> >the
> > > >> >namespace and transaction logs persistently.</description>
> > > >> >  </property>
> > > >> ></configuration>
> > > >> >~
> > > >> >
> > > >> >I saw some report that this may be a classpath problem. Does this
> > > >>sounds
> > > >> >right to you?
> > > >> >
> > > >> >
> > > >> >On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com>
> > > wrote:
> > > >> >
> > > >> >> Hi Telles,
> > > >> >>
> > > >> >> It looks correct. Did you put the hdfs-site.xml into your
> > > >> >>HADOOP_CONF_DIR
> > > >> >> ?(such as ~/.samza/conf)
> > > >> >>
> > > >> >> Fang, Yan
> > > >> >> yanfang724@gmail.com
> > > >> >> +1 (206) 849-4108
> > > >> >>
> > > >> >>
> > > >> >> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
> > > >> >><te...@gmail.com>
> > > >> >> wrote:
> > > >> >>
> > > >> >> > ​Hi Yan Fang,
> > > >> >> >
> > > >> >> > I was able to deploy the file to hdfs, I can see them in all my
> > > >>nodes
> > > >> >>but
> > > >> >> > when I tried running I got this error:
> > > >> >> >
> > > >> >> > Exception in thread "main" java.io.IOException: No FileSystem
> for
> > > >> >>scheme:
> > > >> >> > hdfs
> > > >> >> > at
> > > >> >>
> > >
> >
> >>org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
> > > >> >> >  at
> > > >> >>
> > >
> >>org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> > > >> >> > at
> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> > > >> >> >  at
> > > >> >>
> > >
> >>org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> > > >> >> > at
> > org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> > > >> >> >  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> > > >> >> > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> > > >> >> >  at
> > > >> >> >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.s
> > > >>>>ca
> > > >> >>la:111)
> > > >> >> > at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> > > >> >> >  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> > > >> >> > at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> > > >> >> >  at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> > > >> >> > at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> > > >> >> >
> > > >> >> >
> > > >> >> > This is my yarn.package.path config:
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0
> > > >>>>.7
> > > >> >>.0-dist.tar.gz
> > > >> >> >
> > > >> >> > Thanks in advance
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >> > On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <
> yanfang724@gmail.com>
> > > >> >>wrote:
> > > >> >> >
> > > >> >> > > Hi Telles,
> > > >> >> > >
> > > >> >> > > In terms of "*I tried pushing the tar file to HDFS but I got
> an
> > > >> >>error
> > > >> >> > from
> > > >> >> > > hadoop saying that it couldn’t find core-site.xml file*.", I
> > > >>guess
> > > >> >>you
> > > >> >> > set
> > > >> >> > > the HADOOP_CONF_DIR variable and made it point to
> > ~/.samza/conf.
> > > >>You
> > > >> >> can
> > > >> >> > do
> > > >> >> > > 1) make the HADOOP_CONF_DIR point to the directory where your
> > > >>conf
> > > >> >> files
> > > >> >> > > are, such as /etc/hadoop/conf. Or 2) copy the config files to
> > > >> >> > > ~/.samza/conf. Thank you,
> > > >> >> > >
> > > >> >> > > Cheer,
> > > >> >> > >
> > > >> >> > > Fang, Yan
> > > >> >> > > yanfang724@gmail.com
> > > >> >> > > +1 (206) 849-4108
> > > >> >> > >
> > > >> >> > >
> > > >> >> > > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> > > >> >> > > criccomini@linkedin.com.invalid> wrote:
> > > >> >> > >
> > > >> >> > > > Hey Telles,
> > > >> >> > > >
> > > >> >> > > > To get YARN working with the HTTP file system, you need to
> > > >>follow
> > > >> >>the
> > > >> >> > > > instructions on:
> > > >> >> > > >
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >> >>
> > > >>
> > > >>
> > >
> >
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
> > > >> >>-y
> > > >> >> > > > arn.html
> > > >> >> > > >
> > > >> >> > > >
> > > >> >> > > > In the "Set Up Http Filesystem for YARN" section.
> > > >> >> > > >
> > > >> >> > > > You shouldn't need to compile anything (no Gradle, which is
> > > >>what
> > > >> >>your
> > > >> >> > > > stack trace is showing). This setup should be done for all
> of
> > > >>the
> > > >> >> NMs,
> > > >> >> > > > since they will be the ones downloading your job's package
> > > >>(from
> > > >> >> > > > yarn.package.path).
> > > >> >> > > >
> > > >> >> > > > Cheers,
> > > >> >> > > > Chris
> > > >> >> > > >
> > > >> >> > > > On 8/9/14 9:44 PM, "Telles Nobrega" <
> tellesnobrega@gmail.com
> > >
> > > >> >>wrote:
> > > >> >> > > >
> > > >> >> > > > >Hi again, I tried installing the scala libs but the Http
> > > >>problem
> > > >> >> still
> > > >> >> > > > >occurs. I realised that I need to compile incubator samza
> in
> > > >>the
> > > >> >> > > machines
> > > >> >> > > > >that I¹m going to run the jobs, but the compilation fails
> > with
> > > >> >>this
> > > >> >> > huge
> > > >> >> > > > >message:
> > > >> >> > > > >
> > > >> >> > > > >#
> > > >> >> > > > ># There is insufficient memory for the Java Runtime
> > > >>Environment
> > > >> >>to
> > > >> >> > > > >continue.
> > > >> >> > > > ># Native memory allocation (malloc) failed to allocate
> > > >>3946053632
> > > >> >> > bytes
> > > >> >> > > > >for committing reserved memory.
> > > >> >> > > > ># An error report file with more information is saved as:
> > > >> >> > > > >#
> > /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> > > >> >> > > > >Could not write standard input into: Gradle Worker 13.
> > > >> >> > > > >java.io.IOException: Broken pipe
> > > >> >> > > > >       at java.io.FileOutputStream.writeBytes(Native
> Method)
> > > >> >> > > > >       at
> > > >> >>java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >>
> > >
> >>java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >>java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
> > > >>>>>pu
> > > >> >>>tH
> > > >> >> > > > >andleRunner.java:53)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> > > >>>>>rI
> > > >> >>>mp
> > > >> >> > > > >l$1.run(DefaultExecutorFactory.java:66)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> > > >>>>>av
> > > >> >>>a:
> > > >> >> > > > >1145)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > > >>>>>ja
> > > >> >>>va
> > > >> >> > > > >:615)
> > > >> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> > > >> >> > > > >Process 'Gradle Worker 13' finished with non-zero exit
> > value 1
> > > >> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle
> > > >>Worker
> > > >> >> 13'
> > > >> >> > > > >finished with non-zero exit value 1
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
> > > >>>>>ma
> > > >> >>>lE
> > > >> >> > > > >xitValue(DefaultExecHandle.java:362)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
> > > >>>>>Wo
> > > >> >>>rk
> > > >> >> > > > >erProcess.java:89)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
> > > >>>>>ke
> > > >> >>>rP
> > > >> >> > > > >rocess.java:33)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
> > > >>>>>ef
> > > >> >>>au
> > > >> >> > > > >ltWorkerProcess.java:55)
> > > >> >> > > > >       at
> > sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> > > >> >> Method)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> > > >>>>>av
> > > >> >>>a:
> > > >> >> > > > >57)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> > > >>>>>or
> > > >> >>>Im
> > > >> >> > > > >pl.java:43)
> > > >> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> > > >>>>>sp
> > > >> >>>at
> > > >> >> > > > >ch.java:35)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
> > > >>>>>sp
> > > >> >>>at
> > > >> >> > > > >ch.java:24)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> > > >>>>>81
> > > >> >>>)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
> > > >>>>>30
> > > >> >>>)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
> > > >>>>>on
> > > >> >>>Ha
> > > >> >> > > > >ndler.invoke(ProxyDispatchAdapter.java:93)
> > > >> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown
> > > >> >>Source)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
> > > >>>>>xe
> > > >> >>>cH
> > > >> >> > > > >andle.java:212)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
> > > >>>>>le
> > > >> >>>.j
> > > >> >> > > > >ava:309)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
> > > >>>>>r.
> > > >> >>>ja
> > > >> >> > > > >va:108)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
> > > >>>>>:8
> > > >> >>>8)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> > > >>>>>rI
> > > >> >>>mp
> > > >> >> > > > >l$1.run(DefaultExecutorFactory.java:66)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> > > >>>>>av
> > > >> >>>a:
> > > >> >> > > > >1145)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > > >>>>>ja
> > > >> >>>va
> > > >> >> > > > >:615)
> > > >> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> > > >> >> > > > >OpenJDK 64-Bit Server VM warning: INFO:
> > > >> >> > > > >os::commit_memory(0x000000070a6c0000, 3946053632, 0)
> failed;
> > > >> >> > > > >error='Cannot allocate memory' (errno=12)
> > > >> >> > > > >#
> > > >> >> > > > ># There is insufficient memory for the Java Runtime
> > > >>Environment
> > > >> >>to
> > > >> >> > > > >continue.
> > > >> >> > > > ># Native memory allocation (malloc) failed to allocate
> > > >>3946053632
> > > >> >> > bytes
> > > >> >> > > > >for committing reserved memory.
> > > >> >> > > > ># An error report file with more information is saved as:
> > > >> >> > > > >#
> > /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> > > >> >> > > > >Could not write standard input into: Gradle Worker 14.
> > > >> >> > > > >java.io.IOException: Broken pipe
> > > >> >> > > > >       at java.io.FileOutputStream.writeBytes(Native
> Method)
> > > >> >> > > > >       at
> > > >> >>java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >>
> > >
> >>java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >>java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
> > > >>>>>pu
> > > >> >>>tH
> > > >> >> > > > >andleRunner.java:53)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
> > > >>>>>rI
> > > >> >>>mp
> > > >> >> > > > >l$1.run(DefaultExecutorFactory.java:66)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
> > > >>>>>av
> > > >> >>>a:
> > > >> >> > > > >1145)
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
> > > >>>>>ja
> > > >> >>>va
> > > >> >> > > > >:615)
> > > >> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> > > >> >> > > > >Process 'Gradle Worker 14' finished with non-zero exit
> > value 1
> > > >> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle
> > > >>Worker
> > > >> >> 14'
> > > >> >> > > > >finished with non-zero exit value 1
> > > >> >> > > > >       at
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> > > >> >> > > > >org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> > > >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > >> >> > > > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> > > >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> > > >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > >> >> > > > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> > > >> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> > > >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> > > >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > >> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >> >> > > > >       at java.lang.Thread.r
> > > >> >> > > > >
> > > >> >> > > > >Do I need more memory for my machines? Each already has 4GB. I
> > > >> >> > > > >really need to have this running. I'm not sure which way is
> > > >> >> > > > >best, HTTP or HDFS; which one do you suggest, and how can I
> > > >> >> > > > >solve my problem in each case?
> > > >> >> > > > >
> > > >> >> > > > >Thanks in advance and sorry for bothering this much.
> > > >> >> > > > >On 10 Aug 2014, at 00:20, Telles Nobrega
> > > >> >><te...@gmail.com>
> > > >> >> > > wrote:
> > > >> >> > > > >
> > > >> >> > > > >> Hi Chris, now I have the tar file in my RM machine, and
> > the
> > > >> >>yarn
> > > >> >> > path
> > > >> >> > > > >>points to it. I changed the core-site.xml to use
> > > >>HttpFileSystem
> > > >> >> > instead
> > > >> >> > > > >>of HDFS now it is failing with
> > > >> >> > > > >>
> > > >> >> > > > >> Application application_1407640485281_0001 failed 2
> times
> > > >>due
> > > >> >>to
> > > >> >> AM
> > > >> >> > > > >>Container for appattempt_1407640485281_0001_000002 exited
> > > >>with
> > > >> >> > > > >>exitCode:-1000 due to: java.lang.ClassNotFoundException:
> > > >>Class
> > > >> >> > > > >>org.apache.samza.util.hadoop.HttpFileSystem not found
> > > >> >> > > > >>
> > > >> >> > > > >> I think I can solve this just installing scala files
> from
> > > >>the
> > > >> >> samza
> > > >> >> > > > >>tutorial, can you confirm that?
> > > >> >> > > > >>
> > > >> >> > > > >> On 09 Aug 2014, at 08:34, Telles Nobrega
> > > >> >><tellesnobrega@gmail.com
> > > >> >> >
> > > >> >> > > > >>wrote:
> > > >> >> > > > >>
> > > >> >> > > > >>> Hi Chris,
> > > >> >> > > > >>>
> > > >> >> > > > >>> I think the problem is that I forgot to update the
> > > >> >> > yarn.job.package.
> > > >> >> > > > >>> I will try again to see if it works now.
> > > >> >> > > > >>>
> > > >> >> > > > >>> I have one more question, how can I stop (command line)
> > the
> > > >> >>jobs
> > > >> >> > > > >>>running in my topology, for the experiment that I will
> > run,
> > > >>I
> > > >> >>need
> > > >> >> > to
> > > >> >> > > > >>>run the same job in 4 minutes intervals. So I need to
> kill
> > > >>it,
> > > >> >> clean
> > > >> >> > > > >>>the kafka topics and rerun.
> > > >> >> > > > >>>
> > > >> >> > > > >>> Thanks in advance.
> > > >> >> > > > >>>
> > > >> >> > > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> > > >> >> > > > >>><cr...@linkedin.com.INVALID> wrote:
> > > >> >> > > > >>>
> > > >> >> > > > >>>> Hey Telles,
> > > >> >> > > > >>>>
> > > >> >> > > > >>>>>> Do I need to have the job folder on each machine in
> my
> > > >> >> cluster?
> > > >> >> > > > >>>>
> > > >> >> > > > >>>> No, you should not need to do this. There are two ways
> > to
> > > >> >>deploy
> > > >> >> > > your
> > > >> >> > > > >>>> tarball to the YARN grid. One is to put it in HDFS,
> and
> > > >>the
> > > >> >> other
> > > >> >> > is
> > > >> >> > > > >>>>to
> > > >> >> > > > >>>> put it on an HTTP server. The link to running a Samza
> > job
> > > >>in
> > > >> >>a
> > > >> >> > > > >>>>multi-node
> > > >> >> > > > >>>> YARN cluster describes how to do both (either HTTP
> > server
> > > >>or
> > > >> >> > HDFS).
> > > >> >> > > > >>>>
> > > >> >> > > > >>>> In both cases, once the tarball is put in on the
> > HTTP/HDFS
> > > >> >> > > server(s),
> > > >> >> > > > >>>>you
> > > >> >> > > > >>>> must update yarn.package.path to point to it. From
> > there,
> > > >>the
> > > >> >> YARN
> > > >> >> > > NM
> > > >> >> > > > >>>> should download it for you automatically when you
> start
> > > >>your
> > > >> >> job.
> > > >> >> > > > >>>>
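> > > >> >> > > > >>>> As a rough example (the HDFS path and the namenode host/port
> > > >> >> > > > >>>> below are illustrative, not taken from your setup), the HDFS
> > > >> >> > > > >>>> route looks like:
> > > >> >> > > > >>>>
> > > >> >> > > > >>>>   hadoop fs -mkdir -p /samza/packages
> > > >> >> > > > >>>>   hadoop fs -put samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz /samza/packages/
> > > >> >> > > > >>>>
> > > >> >> > > > >>>> and then, in the job's properties file:
> > > >> >> > > > >>>>
> > > >> >> > > > >>>>   yarn.package.path=hdfs://<namenode-host>:<fs-port>/samza/packages/samza-job-package-0.7.0-dist.tar.gz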
> > > >> >> > > > >>>> * Can you send along a paste of your job config?
> > > >> >> > > > >>>>
> > > >> >> > > > >>>> Cheers,
> > > >> >> > > > >>>> Chris
> > > >> >> > > > >>>>
> > > >> >> > > > >>>> On 8/8/14 8:04 AM, "Claudio Martins"
> > > >> >><cl...@mobileaware.com>
> > > >> >> > > wrote:
> > > >> >> > > > >>>>
> > > >> >> > > > >>>>> Hi Telles, it looks to me that you forgot to update
> the
> > > >> >> > > > >>>>> "yarn.package.path"
> > > >> >> > > > >>>>> attribute in your config file for the task.
> > > >> >> > > > >>>>>
> > > >> >> > > > >>>>> - Claudio Martins
> > > >> >> > > > >>>>> Head of Engineering
> > > >> >> > > > >>>>> MobileAware USA Inc. / www.mobileaware.com
> > > >> >> > > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> > > >> >> > > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> > > >> >> > > > >>>>>
> > > >> >> > > > >>>>>
> > > >> >> > > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> > > >> >> > > > >>>>><te...@gmail.com>
> > > >> >> > > > >>>>> wrote:
> > > >> >> > > > >>>>>
> > > >> >> > > > >>>>>> Hi,
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>> this is my first time trying to run a job on a
> > multinode
> > > >> >> > > > >>>>>>environment. I
> > > >> >> > > > >>>>>> have the cluster set up, I can see in the GUI that
> all
> > > >> >>nodes
> > > >> >> are
> > > >> >> > > > >>>>>> working.
> > > >> >> > > > >>>>>> Do I need to have the job folder on each machine in
> my
> > > >> >> cluster?
> > > >> >> > > > >>>>>> - The first time I tried running with the job on the
> > > >> >>namenode
> > > >> >> > > > >>>>>>machine
> > > >> >> > > > >>>>>> and
> > > >> >> > > > >>>>>> it failed saying:
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>> Application application_1407509228798_0001 failed 2
> > > >>times
> > > >> >>due
> > > >> >> to
> > > >> >> > > AM
> > > >> >> > > > >>>>>> Container for appattempt_1407509228798_0001_000002
> > > >>exited
> > > >> >>with
> > > >> >> > > > >>>>>>exitCode:
> > > >> >> > > > >>>>>> -1000 due to: File
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>>
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> > > >>>>>>>>>>pa
> > > >> >>>>>>>>ck
> > > >> >> > > > >>>>>>age-
> > > >> >> > > > >>>>>> 0.7.0-dist.tar.gz
> > > >> >> > > > >>>>>> does not exist
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>> So I copied the folder to each machine in my cluster
> > and
> > > >> >>got
> > > >> >> > this
> > > >> >> > > > >>>>>>error:
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>> Application application_1407509228798_0002 failed 2
> > > >>times
> > > >> >>due
> > > >> >> to
> > > >> >> > > AM
> > > >> >> > > > >>>>>> Container for appattempt_1407509228798_0002_000002
> > > >>exited
> > > >> >>with
> > > >> >> > > > >>>>>>exitCode:
> > > >> >> > > > >>>>>> -1000 due to: Resource
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>>
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >>>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> > > >>>>>>>>>>pa
> > > >> >>>>>>>>ck
> > > >> >> > > > >>>>>>age-
> > > >> >> > > > >>>>>> 0.7.0-dist.tar.gz
> > > >> >> > > > >>>>>> changed on src filesystem (expected 1407509168000,
> was
> > > >> >> > > 1407509434000
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>> What am I missing?
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>> p.s.: I followed this
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>><
> > > >> >> > > >
> > > >> >>
> > https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> > > >> >> > > > >>>>>>a>
> > > >> >> > > > >>>>>> tutorial
> > > >> >> > > > >>>>>> and this
> > > >> >> > > > >>>>>> <
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>>
> > > >> >> > > >
> > > >> >>
> > > http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> > > >> >> > > > >>>>>>node
> > > >> >> > > > >>>>>> -yarn.html
> > > >> >> > > > >>>>>>>
> > > >> >> > > > >>>>>> to
> > > >> >> > > > >>>>>> set up the cluster.
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>> Help is much appreciated.
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>> Thanks in advance.
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>>> --
> > > >> >> > > > >>>>>> ------------------------------------------
> > > >> >> > > > >>>>>> Telles Mota Vidal Nobrega
> > > >> >> > > > >>>>>> M.sc. Candidate at UFCG
> > > >> >> > > > >>>>>> B.sc. in Computer Science at UFCG
> > > >> >> > > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> > > >> >> > > > >>>>>>
> > > >> >> > > > >>>>
> > > >> >> > > > >>>
> > > >> >> > > > >>
> > > >> >> > > > >
> > > >> >> > > >
> > > >> >> > > >
> > > >> >> > >
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >> > --
> > > >> >> > ------------------------------------------
> > > >> >> > Telles Mota Vidal Nobrega
> > > >> >> > M.sc. Candidate at UFCG
> > > >> >> > B.sc. in Computer Science at UFCG
> > > >> >> > Software Engineer at OpenStack Project - HP/LSD-UFCG
> > > >> >> >
> > > >> >>
> > > >> >
> > > >> >
> > > >> >
> > > >> >--
> > > >> >------------------------------------------
> > > >> >Telles Mota Vidal Nobrega
> > > >> >M.sc. Candidate at UFCG
> > > >> >B.sc. in Computer Science at UFCG
> > > >> >Software Engineer at OpenStack Project - HP/LSD-UFCG
> > > >>
> > > >>
> > > >
> > > >
> > > >--
> > > >------------------------------------------
> > > >Telles Mota Vidal Nobrega
> > > >M.sc. Candidate at UFCG
> > > >B.sc. in Computer Science at UFCG
> > > >Software Engineer at OpenStack Project - HP/LSD-UFCG
> > >
> > >
> >
>
>
>
> --
> ------------------------------------------
> Telles Mota Vidal Nobrega
> M.sc. Candidate at UFCG
> B.sc. in Computer Science at UFCG
> Software Engineer at OpenStack Project - HP/LSD-UFCG
>

Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Sure, thanks.


On Mon, Aug 11, 2014 at 6:22 PM, Yan Fang <ya...@gmail.com> wrote:

> Hi Telles,
>
> I am not sure whether exporting the CLASSPATH works. (sometimes it does not
> work for me...) My suggestion is to include the hdfs jar explicitly in the
> package that you upload to hdfs. Also, remember to put the jar into your
> local samza (which is deploy/samza/lib if you go with the hello-samza
> tutorial) Let me know if that works.
>
> Cheers,
>
> Fang, Yan
> yanfang724@gmail.com
> +1 (206) 849-4108
>
>
> On Mon, Aug 11, 2014 at 2:04 PM, Chris Riccomini <
> criccomini@linkedin.com.invalid> wrote:
>
> > Hey Telles,
> >
> > Hmm. I'm out of ideas. If Zhijie is around, he'd probably be of use, but
> I
> > haven't heard from him in a while.
> >
> > I'm afraid your best bet is probably to email the YARN dev mailing list,
> > since this is a YARN config issue.
> >
> > Cheers,
> > Chris
> >
> > On 8/11/14 1:58 PM, "Telles Nobrega" <te...@gmail.com> wrote:
> >
> > >I exported export
> >
> >CLASSPATH=$CLASSPATH:hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar
> > >and still happened the same problem.
> > >
> > >
> > >On Mon, Aug 11, 2014 at 5:35 PM, Chris Riccomini <
> > >criccomini@linkedin.com.invalid> wrote:
> > >
> > >> Hey Telles,
> > >>
> > >> It sounds like either the HDFS jar is missing from the classpath, or
> the
> > >> hdfs file system needs to be configured:
> > >>
> > >> <property>
> > >>   <name>fs.hdfs.impl</name>
> > >>   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
> > >>   <description>The FileSystem for hdfs: uris.</description>
> > >> </property>
> > >>
> > >>
> > >> (from
> > >>
> > >>
> >
> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptA
> > >>zE
> > >> 0)
> > >>
> > >> I believe this will need to be configured for your NM.
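> > >> As a rough sketch (paths and daemon scripts are illustrative and depend
> > >> on the install), on each NM host that would mean:
> > >>
> > >>   # add the fs.hdfs.impl property above to core-site.xml under HADOOP_CONF_DIR
> > >>   vi /home/ubuntu/hadoop-2.3.0/etc/hadoop/core-site.xml
> > >>   # then restart the NodeManager so it picks up the change
> > >>   sbin/yarn-daemon.sh stop nodemanager
> > >>   sbin/yarn-daemon.sh start nodemanager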
> > >>
> > >> Cheers,
> > >> Chris
> > >>
> > >> On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com> wrote:
> > >>
> > >> >Yes, it is like this:
> > >> >
> > >> ><configuration>
> > >> >  <property>
> > >> >    <name>dfs.datanode.data.dir</name>
> > >> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
> > >> >    <description>Comma separated list of paths on the local
> filesystem
> > >>of
> > >> >a
> > >> >DataNode where it should store its blocks.</description>
> > >> >  </property>
> > >> >
> > >> >  <property>
> > >> >    <name>dfs.namenode.name.dir</name>
> > >> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
> > >> >    <description>Path on the local filesystem where the NameNode
> stores
> > >> >the
> > >> >namespace and transaction logs persistently.</description>
> > >> >  </property>
> > >> ></configuration>
> > >> >~
> > >> >
> > >> >I saw some report that this may be a classpath problem. Does this
> > >>sounds
> > >> >right to you?
> > >> >
> > >> >
> > >> >On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com>
> > wrote:
> > >> >
> > >> >> Hi Telles,
> > >> >>
> > >> >> It looks correct. Did you put the hdfs-site.xml into your
> > >> >>HADOOP_CONF_DIR
> > >> >> ?(such as ~/.samza/conf)
> > >> >>
> > >> >> Fang, Yan
> > >> >> yanfang724@gmail.com
> > >> >> +1 (206) 849-4108
> > >> >>
> > >> >>
> > >> >> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
> > >> >><te...@gmail.com>
> > >> >> wrote:
> > >> >>
> > > >> >> > Hi Yan Fang,
> > >> >> >
> > >> >> > I was able to deploy the file to hdfs, I can see them in all my
> > >>nodes
> > >> >>but
> > >> >> > when I tried running I got this error:
> > >> >> >
> > >> >> > Exception in thread "main" java.io.IOException: No FileSystem for
> > >> >>scheme:
> > >> >> > hdfs
> > > >> >> > at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
> > > >> >> > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> > > >> >> > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> > > >> >> > at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> > > >> >> > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> > > >> >> > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> > > >> >> > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> > > >> >> > at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> > >> >> > at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> > >> >> >  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> > >> >> > at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> > >> >> >  at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> > >> >> > at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> > >> >> >
> > >> >> >
> > >> >> > This is my yarn.package.path config:
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >>
> > >>
> >
> > > >> >> > yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7.0-dist.tar.gz
> > >> >> >
> > >> >> > Thanks in advance
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> > On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <ya...@gmail.com>
> > >> >>wrote:
> > >> >> >
> > >> >> > > Hi Telles,
> > >> >> > >
> > >> >> > > In terms of "*I tried pushing the tar file to HDFS but I got an
> > >> >>error
> > >> >> > from
> > >> >> > > hadoop saying that it couldn’t find core-site.xml file*.", I
> > >>guess
> > >> >>you
> > >> >> > set
> > >> >> > > the HADOOP_CONF_DIR variable and made it point to
> ~/.samza/conf.
> > >>You
> > >> >> can
> > >> >> > do
> > >> >> > > 1) make the HADOOP_CONF_DIR point to the directory where your
> > >>conf
> > >> >> files
> > >> >> > > are, such as /etc/hadoop/conf. Or 2) copy the config files to
> > >> >> > > ~/.samza/conf. Thank you,
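> > > >> >> > > For instance (paths are illustrative for a tarball install of
> > > >> >> > > Hadoop 2.3.0 under /home/ubuntu):
> > > >> >> > >
> > > >> >> > >   # option 1: point at the real Hadoop conf directory
> > > >> >> > >   export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.3.0/etc/hadoop
> > > >> >> > >
> > > >> >> > >   # option 2: copy the client configs next to Samza
> > > >> >> > >   mkdir -p ~/.samza/conf
> > > >> >> > >   cp /home/ubuntu/hadoop-2.3.0/etc/hadoop/core-site.xml ~/.samza/conf/
> > > >> >> > >   cp /home/ubuntu/hadoop-2.3.0/etc/hadoop/hdfs-site.xml ~/.samza/conf/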
> > >> >> > >
> > >> >> > > Cheer,
> > >> >> > >
> > >> >> > > Fang, Yan
> > >> >> > > yanfang724@gmail.com
> > >> >> > > +1 (206) 849-4108
> > >> >> > >
> > >> >> > >
> > >> >> > > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> > >> >> > > criccomini@linkedin.com.invalid> wrote:
> > >> >> > >
> > >> >> > > > Hey Telles,
> > >> >> > > >
> > >> >> > > > To get YARN working with the HTTP file system, you need to
> > >>follow
> > >> >>the
> > >> >> > > > instructions on:
> > >> >> > > >
> > >> >> > > >
> > >> >> > >
> > >> >> >
> > >> >>
> > >> >>
> > >>
> > >>
> >
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
> > >> >>-y
> > >> >> > > > arn.html
> > >> >> > > >
> > >> >> > > >
> > >> >> > > > In the "Set Up Http Filesystem for YARN" section.
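> > > >> >> > > > From memory (so double-check the tutorial), that section boils
> > > >> >> > > > down to copying the Samza/Scala jars onto the NM classpath and
> > > >> >> > > > adding roughly this to core-site.xml on every NM:
> > > >> >> > > >
> > > >> >> > > >   <property>
> > > >> >> > > >     <name>fs.http.impl</name>
> > > >> >> > > >     <value>org.apache.samza.util.hadoop.HttpFileSystem</value>
> > > >> >> > > >   </property>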
> > >> >> > > >
> > >> >> > > > You shouldn't need to compile anything (no Gradle, which is
> > >>what
> > >> >>your
> > >> >> > > > stack trace is showing). This setup should be done for all of
> > >>the
> > >> >> NMs,
> > >> >> > > > since they will be the ones downloading your job's package
> > >>(from
> > >> >> > > > yarn.package.path).
> > >> >> > > >
> > >> >> > > > Cheers,
> > >> >> > > > Chris
> > >> >> > > >
> > >> >> > > > On 8/9/14 9:44 PM, "Telles Nobrega" <tellesnobrega@gmail.com
> >
> > >> >>wrote:
> > >> >> > > >
> > >> >> > > > >Hi again, I tried installing the scala libs but the Http
> > >>problem
> > >> >> still
> > >> >> > > > >occurs. I realised that I need to compile incubator samza in
> > >>the
> > >> >> > > machines
> > >> >> > > > >that I¹m going to run the jobs, but the compilation fails
> with
> > >> >>this
> > >> >> > huge
> > >> >> > > > >message:
> > >> >> > > > >
> > >> >> > > > >#
> > >> >> > > > ># There is insufficient memory for the Java Runtime
> > >>Environment
> > >> >>to
> > >> >> > > > >continue.
> > >> >> > > > ># Native memory allocation (malloc) failed to allocate
> > >>3946053632
> > >> >> > bytes
> > >> >> > > > >for committing reserved memory.
> > >> >> > > > ># An error report file with more information is saved as:
> > >> >> > > > >#
> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> > >> >> > > > >Could not write standard input into: Gradle Worker 13.
> > >> >> > > > >java.io.IOException: Broken pipe
> > >> >> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > >> >> > > > >       at
> > > >> >> > > > >java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > >> >> > > > >       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > >> >> > > > >       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > >> >> > > > >       at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> > > >> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> > > >> >> > > > >Process 'Gradle Worker 13' finished with non-zero exit value 1
> > > >> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 13' finished with non-zero exit value 1
> > > >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> > > >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > >> >> > > > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> > > >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> > > >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > >> >> > > > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> > > >> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> > > >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> > > >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > >> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> > >> >> > > > >OpenJDK 64-Bit Server VM warning: INFO:
> > >> >> > > > >os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed;
> > >> >> > > > >error='Cannot allocate memory' (errno=12)
> > >> >> > > > >#
> > >> >> > > > ># There is insufficient memory for the Java Runtime
> > >>Environment
> > >> >>to
> > >> >> > > > >continue.
> > >> >> > > > ># Native memory allocation (malloc) failed to allocate
> > >>3946053632
> > >> >> > bytes
> > >> >> > > > >for committing reserved memory.
> > >> >> > > > ># An error report file with more information is saved as:
> > >> >> > > > >#
> /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> > >> >> > > > >Could not write standard input into: Gradle Worker 14.
> > >> >> > > > >java.io.IOException: Broken pipe
> > >> >> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > >> >> > > > >       at
> > > >> >> > > > >java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > >> >> > > > >       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > >> >> > > > >       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > >> >> > > > >       at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> > > >> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> > > >> >> > > > >Process 'Gradle Worker 14' finished with non-zero exit value 1
> > > >> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
> > > >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> > > >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > >> >> > > > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> > > >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> > > >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > >> >> > > > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> > > >> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> > > >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> > > >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> > > >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > >> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >> >> > > > >       at java.lang.Thread.r
> > >> >> > > > >
> > > >> >> > > > >Do I need more memory for my machines? Each already has 4GB. I
> > > >> >> > > > >really need to have this running. I'm not sure which way is
> > > >> >> > > > >best, HTTP or HDFS; which one do you suggest, and how can I
> > > >> >> > > > >solve my problem in each case?
> > > >> >> > > > >
> > > >> >> > > > >Thanks in advance and sorry for bothering this much.
> > >> >> > > > >On 10 Aug 2014, at 00:20, Telles Nobrega
> > >> >><te...@gmail.com>
> > >> >> > > wrote:
> > >> >> > > > >
> > >> >> > > > >> Hi Chris, now I have the tar file in my RM machine, and
> the
> > >> >>yarn
> > >> >> > path
> > >> >> > > > >>points to it. I changed the core-site.xml to use
> > >>HttpFileSystem
> > >> >> > instead
> > >> >> > > > >>of HDFS now it is failing with
> > >> >> > > > >>
> > >> >> > > > >> Application application_1407640485281_0001 failed 2 times
> > >>due
> > >> >>to
> > >> >> AM
> > >> >> > > > >>Container for appattempt_1407640485281_0001_000002 exited
> > >>with
> > >> >> > > > >>exitCode:-1000 due to: java.lang.ClassNotFoundException:
> > >>Class
> > >> >> > > > >>org.apache.samza.util.hadoop.HttpFileSystem not found
> > >> >> > > > >>
> > >> >> > > > >> I think I can solve this just installing scala files from
> > >>the
> > >> >> samza
> > >> >> > > > >>tutorial, can you confirm that?
> > >> >> > > > >>
> > >> >> > > > >> On 09 Aug 2014, at 08:34, Telles Nobrega
> > >> >><tellesnobrega@gmail.com
> > >> >> >
> > >> >> > > > >>wrote:
> > >> >> > > > >>
> > >> >> > > > >>> Hi Chris,
> > >> >> > > > >>>
> > >> >> > > > >>> I think the problem is that I forgot to update the
> > >> >> > yarn.job.package.
> > >> >> > > > >>> I will try again to see if it works now.
> > >> >> > > > >>>
> > >> >> > > > >>> I have one more question, how can I stop (command line)
> the
> > >> >>jobs
> > >> >> > > > >>>running in my topology, for the experiment that I will
> run,
> > >>I
> > >> >>need
> > >> >> > to
> > >> >> > > > >>>run the same job in 4 minutes intervals. So I need to kill
> > >>it,
> > >> >> clean
> > >> >> > > > >>>the kafka topics and rerun.
> > >> >> > > > >>>
> > >> >> > > > >>> Thanks in advance.
> > >> >> > > > >>>
> > >> >> > > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> > >> >> > > > >>><cr...@linkedin.com.INVALID> wrote:
> > >> >> > > > >>>
> > >> >> > > > >>>> Hey Telles,
> > >> >> > > > >>>>
> > >> >> > > > >>>>>> Do I need to have the job folder on each machine in my
> > >> >> cluster?
> > >> >> > > > >>>>
> > >> >> > > > >>>> No, you should not need to do this. There are two ways
> to
> > >> >>deploy
> > >> >> > > your
> > >> >> > > > >>>> tarball to the YARN grid. One is to put it in HDFS, and
> > >>the
> > >> >> other
> > >> >> > is
> > >> >> > > > >>>>to
> > >> >> > > > >>>> put it on an HTTP server. The link to running a Samza
> job
> > >>in
> > >> >>a
> > >> >> > > > >>>>multi-node
> > >> >> > > > >>>> YARN cluster describes how to do both (either HTTP
> server
> > >>or
> > >> >> > HDFS).
> > >> >> > > > >>>>
> > >> >> > > > >>>> In both cases, once the tarball is put in on the
> HTTP/HDFS
> > >> >> > > server(s),
> > >> >> > > > >>>>you
> > >> >> > > > >>>> must update yarn.package.path to point to it. From
> there,
> > >>the
> > >> >> YARN
> > >> >> > > NM
> > >> >> > > > >>>> should download it for you automatically when you start
> > >>your
> > >> >> job.
> > >> >> > > > >>>>
> > >> >> > > > >>>> * Can you send along a paste of your job config?
> > >> >> > > > >>>>
> > >> >> > > > >>>> Cheers,
> > >> >> > > > >>>> Chris
> > >> >> > > > >>>>
> > >> >> > > > >>>> On 8/8/14 8:04 AM, "Claudio Martins"
> > >> >><cl...@mobileaware.com>
> > >> >> > > wrote:
> > >> >> > > > >>>>
> > >> >> > > > >>>>> Hi Telles, it looks to me that you forgot to update the
> > >> >> > > > >>>>> "yarn.package.path"
> > >> >> > > > >>>>> attribute in your config file for the task.
> > >> >> > > > >>>>>
> > >> >> > > > >>>>> - Claudio Martins
> > >> >> > > > >>>>> Head of Engineering
> > >> >> > > > >>>>> MobileAware USA Inc. / www.mobileaware.com
> > >> >> > > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> > >> >> > > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> > >> >> > > > >>>>>
> > >> >> > > > >>>>>
> > >> >> > > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> > >> >> > > > >>>>><te...@gmail.com>
> > >> >> > > > >>>>> wrote:
> > >> >> > > > >>>>>
> > >> >> > > > >>>>>> Hi,
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> this is my first time trying to run a job on a
> multinode
> > >> >> > > > >>>>>>environment. I
> > >> >> > > > >>>>>> have the cluster set up, I can see in the GUI that all
> > >> >>nodes
> > >> >> are
> > >> >> > > > >>>>>> working.
> > >> >> > > > >>>>>> Do I need to have the job folder on each machine in my
> > >> >> cluster?
> > >> >> > > > >>>>>> - The first time I tried running with the job on the
> > >> >>namenode
> > >> >> > > > >>>>>>machine
> > >> >> > > > >>>>>> and
> > >> >> > > > >>>>>> it failed saying:
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> Application application_1407509228798_0001 failed 2
> > >>times
> > >> >>due
> > >> >> to
> > >> >> > > AM
> > >> >> > > > >>>>>> Container for appattempt_1407509228798_0001_000002
> > >>exited
> > >> >>with
> > >> >> > > > >>>>>>exitCode:
> > >> >> > > > >>>>>> -1000 due to: File
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>>
> > >> >> > > >
> > >> >> > >
> > >> >> >
> > >> >>
> > >>
> >
> >>>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> > >>>>>>>>>>pa
> > >> >>>>>>>>ck
> > >> >> > > > >>>>>>age-
> > >> >> > > > >>>>>> 0.7.0-dist.tar.gz
> > >> >> > > > >>>>>> does not exist
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> So I copied the folder to each machine in my cluster
> and
> > >> >>got
> > >> >> > this
> > >> >> > > > >>>>>>error:
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> Application application_1407509228798_0002 failed 2
> > >>times
> > >> >>due
> > >> >> to
> > >> >> > > AM
> > >> >> > > > >>>>>> Container for appattempt_1407509228798_0002_000002
> > >>exited
> > >> >>with
> > >> >> > > > >>>>>>exitCode:
> > >> >> > > > >>>>>> -1000 due to: Resource
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>>
> > >> >> > > >
> > >> >> > >
> > >> >> >
> > >> >>
> > >>
> >
> >>>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> > >>>>>>>>>>pa
> > >> >>>>>>>>ck
> > >> >> > > > >>>>>>age-
> > >> >> > > > >>>>>> 0.7.0-dist.tar.gz
> > >> >> > > > >>>>>> changed on src filesystem (expected 1407509168000, was
> > >> >> > > 1407509434000
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> What am I missing?
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> p.s.: I followed this
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>><
> > >> >> > > >
> > >> >>
> https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> > >> >> > > > >>>>>>a>
> > >> >> > > > >>>>>> tutorial
> > >> >> > > > >>>>>> and this
> > >> >> > > > >>>>>> <
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>>
> > >> >> > > >
> > >> >>
> > http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> > >> >> > > > >>>>>>node
> > >> >> > > > >>>>>> -yarn.html
> > >> >> > > > >>>>>>>
> > >> >> > > > >>>>>> to
> > >> >> > > > >>>>>> set up the cluster.
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> Help is much appreciated.
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> Thanks in advance.
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> --
> > >> >> > > > >>>>>> ------------------------------------------
> > >> >> > > > >>>>>> Telles Mota Vidal Nobrega
> > >> >> > > > >>>>>> M.sc. Candidate at UFCG
> > >> >> > > > >>>>>> B.sc. in Computer Science at UFCG
> > >> >> > > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>
> > >> >> > > > >>>
> > >> >> > > > >>
> > >> >> > > > >
> > >> >> > > >
> > >> >> > > >
> > >> >> > >
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> > --
> > >> >> > ------------------------------------------
> > >> >> > Telles Mota Vidal Nobrega
> > >> >> > M.sc. Candidate at UFCG
> > >> >> > B.sc. in Computer Science at UFCG
> > >> >> > Software Engineer at OpenStack Project - HP/LSD-UFCG
> > >> >> >
> > >> >>
> > >> >
> > >> >
> > >> >
> > >> >--
> > >> >------------------------------------------
> > >> >Telles Mota Vidal Nobrega
> > >> >M.sc. Candidate at UFCG
> > >> >B.sc. in Computer Science at UFCG
> > >> >Software Engineer at OpenStack Project - HP/LSD-UFCG
> > >>
> > >>
> > >
> > >
> > >--
> > >------------------------------------------
> > >Telles Mota Vidal Nobrega
> > >M.sc. Candidate at UFCG
> > >B.sc. in Computer Science at UFCG
> > >Software Engineer at OpenStack Project - HP/LSD-UFCG
> >
> >
>



-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Yan Fang <ya...@gmail.com>.
Hi Telles,

I am not sure whether exporting the CLASSPATH works. (sometimes it does not
work for me...) My suggestion is to include the hdfs jar explicitly in the
package that you upload to hdfs. Also, remember to put the jar into your
local samza (which is deploy/samza/lib if you go with the hello-samza
tutorial) Let me know if that works.
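
As a concrete sketch (the jar name and paths assume the Hadoop 2.3.0 tarball
layout you already have under /home/ubuntu; adjust them to your install):

  # make the HDFS client classes visible to the local run-job.sh
  cp ~/hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar deploy/samza/lib/

  # and check that the same jar ends up inside the tar.gz you upload
  tar -tzf samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz | grep hadoop-hdfs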

Cheers,

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108


On Mon, Aug 11, 2014 at 2:04 PM, Chris Riccomini <
criccomini@linkedin.com.invalid> wrote:

> Hey Telles,
>
> Hmm. I'm out of ideas. If Zhijie is around, he'd probably be of use, but I
> haven't heard from him in a while.
>
> I'm afraid your best bet is probably to email the YARN dev mailing list,
> since this is a YARN config issue.
>
> Cheers,
> Chris
>
> On 8/11/14 1:58 PM, "Telles Nobrega" <te...@gmail.com> wrote:
>
> >I exported export
> >CLASSPATH=$CLASSPATH:hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar
> >and still happened the same problem.
> >
> >
> >On Mon, Aug 11, 2014 at 5:35 PM, Chris Riccomini <
> >criccomini@linkedin.com.invalid> wrote:
> >
> >> Hey Telles,
> >>
> >> It sounds like either the HDFS jar is missing from the classpath, or the
> >> hdfs file system needs to be configured:
> >>
> >> <property>
> >>   <name>fs.hdfs.impl</name>
> >>   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
> >>   <description>The FileSystem for hdfs: uris.</description>
> >> </property>
> >>
> >>
> >> (from
> >>
> >>
> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptA
> >>zE
> >> 0)
> >>
> >> I believe this will need to be configured for your NM.
> >>
> >> Cheers,
> >> Chris
> >>
> >> On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com> wrote:
> >>
> >> >Yes, it is like this:
> >> >
> >> ><configuration>
> >> >  <property>
> >> >    <name>dfs.datanode.data.dir</name>
> >> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
> >> >    <description>Comma separated list of paths on the local filesystem
> >>of
> >> >a
> >> >DataNode where it should store its blocks.</description>
> >> >  </property>
> >> >
> >> >  <property>
> >> >    <name>dfs.namenode.name.dir</name>
> >> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
> >> >    <description>Path on the local filesystem where the NameNode stores
> >> >the
> >> >namespace and transaction logs persistently.</description>
> >> >  </property>
> >> ></configuration>
> >> >~
> >> >
> >> >I saw some report that this may be a classpath problem. Does this
> >>sounds
> >> >right to you?
> >> >
> >> >
> >> >On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com>
> wrote:
> >> >
> >> >> Hi Telles,
> >> >>
> >> >> It looks correct. Did you put the hdfs-site.xml into your
> >> >>HADOOP_CONF_DIR
> >> >> ?(such as ~/.samza/conf)
> >> >>
> >> >> Fang, Yan
> >> >> yanfang724@gmail.com
> >> >> +1 (206) 849-4108
> >> >>
> >> >>
> >> >> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
> >> >><te...@gmail.com>
> >> >> wrote:
> >> >>
> >> >> > Hi Yan Fang,
> >> >> >
> >> >> > I was able to deploy the file to hdfs, I can see them in all my
> >>nodes
> >> >>but
> >> >> > when I tried running I got this error:
> >> >> >
> >> >> > Exception in thread "main" java.io.IOException: No FileSystem for
> >> >>scheme:
> >> >> > hdfs
> >> >> > at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
> >> >> > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> >> >> > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >> >> > at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> >> >> > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >> >> > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> >> >> > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >> >> > at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> >> >> > at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >> >> >  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >> >> > at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >> >> >  at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> >> >> > at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >> >> >
> >> >> >
> >> >> > This is my yarn.package.path config:
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >>
> >>
> >> >> > yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7.0-dist.tar.gz
> >> >> >
> >> >> > Thanks in advance
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <ya...@gmail.com>
> >> >>wrote:
> >> >> >
> >> >> > > Hi Telles,
> >> >> > >
> >> >> > > In terms of "*I tried pushing the tar file to HDFS but I got an
> >> >>error
> >> >> > from
> >> >> > > hadoop saying that it couldn’t find core-site.xml file*.", I
> >>guess
> >> >>you
> >> >> > set
> >> >> > > the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf.
> >>You
> >> >> can
> >> >> > do
> >> >> > > 1) make the HADOOP_CONF_DIR point to the directory where your
> >>conf
> >> >> files
> >> >> > > are, such as /etc/hadoop/conf. Or 2) copy the config files to
> >> >> > > ~/.samza/conf. Thank you,
> >> >> > >
> >> >> > > Cheer,
> >> >> > >
> >> >> > > Fang, Yan
> >> >> > > yanfang724@gmail.com
> >> >> > > +1 (206) 849-4108
> >> >> > >
> >> >> > >
> >> >> > > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> >> >> > > criccomini@linkedin.com.invalid> wrote:
> >> >> > >
> >> >> > > > Hey Telles,
> >> >> > > >
> >> >> > > > To get YARN working with the HTTP file system, you need to
> >>follow
> >> >>the
> >> >> > > > instructions on:
> >> >> > > >
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >> >>
> >>
> >>
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
> >> >>-y
> >> >> > > > arn.html
> >> >> > > >
> >> >> > > >
> >> >> > > > In the "Set Up Http Filesystem for YARN" section.
> >> >> > > >
> >> >> > > > You shouldn't need to compile anything (no Gradle, which is
> >>what
> >> >>your
> >> >> > > > stack trace is showing). This setup should be done for all of
> >>the
> >> >> NMs,
> >> >> > > > since they will be the ones downloading your job's package
> >>(from
> >> >> > > > yarn.package.path).
> >> >> > > >
> >> >> > > > Cheers,
> >> >> > > > Chris
> >> >> > > >
> >> >> > > > On 8/9/14 9:44 PM, "Telles Nobrega" <te...@gmail.com>
> >> >>wrote:
> >> >> > > >
> >> >> > > > >Hi again, I tried installing the scala libs but the Http
> >>problem
> >> >> still
> >> >> > > > >occurs. I realised that I need to compile incubator samza in
> >>the
> >> >> > > machines
> >> >> > > > >that I¹m going to run the jobs, but the compilation fails with
> >> >>this
> >> >> > huge
> >> >> > > > >message:
> >> >> > > > >
> >> >> > > > >#
> >> >> > > > ># There is insufficient memory for the Java Runtime
> >>Environment
> >> >>to
> >> >> > > > >continue.
> >> >> > > > ># Native memory allocation (malloc) failed to allocate
> >>3946053632
> >> >> > bytes
> >> >> > > > >for committing reserved memory.
> >> >> > > > ># An error report file with more information is saved as:
> >> >> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> >> >> > > > >Could not write standard input into: Gradle Worker 13.
> >> >> > > > >java.io.IOException: Broken pipe
> >> >> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> >> >> > > > >       at
> >> >> > > > >java.io.FileOutputStream.write(FileOutputStream.java:345)
> >> >> > > > >       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >> >> > > > >       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >> >> > > > >       at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> >> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> >> >> > > > >Process 'Gradle Worker 13' finished with non-zero exit value 1
> >> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 13' finished with non-zero exit value 1
> >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >> >> > > > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> >> >> > > > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> >> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> >> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> >> >> > > > >OpenJDK 64-Bit Server VM warning: INFO:
> >> >> > > > >os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed;
> >> >> > > > >error='Cannot allocate memory' (errno=12)
> >> >> > > > >#
> >> >> > > > ># There is insufficient memory for the Java Runtime
> >>Environment
> >> >>to
> >> >> > > > >continue.
> >> >> > > > ># Native memory allocation (malloc) failed to allocate
> >>3946053632
> >> >> > bytes
> >> >> > > > >for committing reserved memory.
> >> >> > > > ># An error report file with more information is saved as:
> >> >> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> >> >> > > > >Could not write standard input into: Gradle Worker 14.
> >> >> > > > >java.io.IOException: Broken pipe
> >> >> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> >> >> > > > >       at
> >> >> > > > >java.io.FileOutputStream.write(FileOutputStream.java:345)
> >> >> > > > >       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >> >> > > > >       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >> >> > > > >       at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> >> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> >> >> > > > >Process 'Gradle Worker 14' finished with non-zero exit value 1
> >> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
> >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> >> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >> >> > > > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> >> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> >> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> >> >> > > > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> >> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> >> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> >> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> >> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> >> > > > >       at java.lang.Thread.r
> >> >> > > > >
> >> >> > > > >Do I need more memory for my machines? Each already has 4GB. I really
> >> >> > > > >need to have this running. I'm not sure which way is best, HTTP or
> >> >> > > > >HDFS; which one do you suggest, and how can I solve my problem in each
> >> >> > > > >case?
> >> >> > > > >
> >> >> > > > >Thanks in advance, and sorry for bothering this much.
> >> >> > > > >On 10 Aug 2014, at 00:20, Telles Nobrega
> >> >><te...@gmail.com>
> >> >> > > wrote:
> >> >> > > > >
> >> >> > > > >> Hi Chris, now I have the tar file in my RM machine, and the
> >> >>yarn
> >> >> > path
> >> >> > > > >>points to it. I changed the core-site.xml to use
> >>HttpFileSystem
> >> >> > instead
> >> >> > > > >>of HDFS now it is failing with
> >> >> > > > >>
> >> >> > > > >> Application application_1407640485281_0001 failed 2 times
> >>due
> >> >>to
> >> >> AM
> >> >> > > > >>Container for appattempt_1407640485281_0001_000002 exited
> >>with
> >> >> > > > >>exitCode:-1000 due to: java.lang.ClassNotFoundException:
> >>Class
> >> >> > > > >>org.apache.samza.util.hadoop.HttpFileSystem not found
> >> >> > > > >>
> >> >> > > > >> I think I can solve this just installing scala files from
> >>the
> >> >> samza
> >> >> > > > >>tutorial, can you confirm that?
> >> >> > > > >>
> >> >> > > > >> On 09 Aug 2014, at 08:34, Telles Nobrega
> >> >><tellesnobrega@gmail.com
> >> >> >
> >> >> > > > >>wrote:
> >> >> > > > >>
> >> >> > > > >>> Hi Chris,
> >> >> > > > >>>
> >> >> > > > >>> I think the problem is that I forgot to update the
> >> >> > yarn.job.package.
> >> >> > > > >>> I will try again to see if it works now.
> >> >> > > > >>>
> >> >> > > > >>> I have one more question, how can I stop (command line) the
> >> >>jobs
> >> >> > > > >>>running in my topology, for the experiment that I will run,
> >>I
> >> >>need
> >> >> > to
> >> >> > > > >>>run the same job in 4 minutes intervals. So I need to kill
> >>it,
> >> >> clean
> >> >> > > > >>>the kafka topics and rerun.
> >> >> > > > >>>
> >> >> > > > >>> Thanks in advance.
> >> >> > > > >>>
> >> >> > > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> >> >> > > > >>><cr...@linkedin.com.INVALID> wrote:
> >> >> > > > >>>
> >> >> > > > >>>> Hey Telles,
> >> >> > > > >>>>
> >> >> > > > >>>>>> Do I need to have the job folder on each machine in my
> >> >> cluster?
> >> >> > > > >>>>
> >> >> > > > >>>> No, you should not need to do this. There are two ways to
> >> >>deploy
> >> >> > > your
> >> >> > > > >>>> tarball to the YARN grid. One is to put it in HDFS, and
> >>the
> >> >> other
> >> >> > is
> >> >> > > > >>>>to
> >> >> > > > >>>> put it on an HTTP server. The link to running a Samza job
> >>in
> >> >>a
> >> >> > > > >>>>multi-node
> >> >> > > > >>>> YARN cluster describes how to do both (either HTTP server
> >>or
> >> >> > HDFS).
> >> >> > > > >>>>
> >> >> > > > >>>> In both cases, once the tarball is put in on the HTTP/HDFS
> >> >> > > server(s),
> >> >> > > > >>>>you
> >> >> > > > >>>> must update yarn.package.path to point to it. From there,
> >>the
> >> >> YARN
> >> >> > > NM
> >> >> > > > >>>> should download it for you automatically when you start
> >>your
> >> >> job.
> >> >> > > > >>>>
> >> >> > > > >>>> * Can you send along a paste of your job config?
> >> >> > > > >>>>
> >> >> > > > >>>> Cheers,
> >> >> > > > >>>> Chris
> >> >> > > > >>>>
> >> >> > > > >>>> On 8/8/14 8:04 AM, "Claudio Martins"
> >> >><cl...@mobileaware.com>
> >> >> > > wrote:
> >> >> > > > >>>>
> >> >> > > > >>>>> Hi Telles, it looks to me that you forgot to update the
> >> >> > > > >>>>> "yarn.package.path"
> >> >> > > > >>>>> attribute in your config file for the task.
> >> >> > > > >>>>>
> >> >> > > > >>>>> - Claudio Martins
> >> >> > > > >>>>> Head of Engineering
> >> >> > > > >>>>> MobileAware USA Inc. / www.mobileaware.com
> >> >> > > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> >> >> > > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> >> >> > > > >>>>>
> >> >> > > > >>>>>
> >> >> > > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> >> >> > > > >>>>><te...@gmail.com>
> >> >> > > > >>>>> wrote:
> >> >> > > > >>>>>
> >> >> > > > >>>>>> Hi,
> >> >> > > > >>>>>>
> >> >> > > > >>>>>> this is my first time trying to run a job on a multinode
> >> >> > > > >>>>>>environment. I
> >> >> > > > >>>>>> have the cluster set up, I can see in the GUI that all
> >> >>nodes
> >> >> are
> >> >> > > > >>>>>> working.
> >> >> > > > >>>>>> Do I need to have the job folder on each machine in my
> >> >> cluster?
> >> >> > > > >>>>>> - The first time I tried running with the job on the
> >> >>namenode
> >> >> > > > >>>>>>machine
> >> >> > > > >>>>>> and
> >> >> > > > >>>>>> it failed saying:
> >> >> > > > >>>>>>
> >> >> > > > >>>>>> Application application_1407509228798_0001 failed 2
> >>times
> >> >>due
> >> >> to
> >> >> > > AM
> >> >> > > > >>>>>> Container for appattempt_1407509228798_0001_000002
> >>exited
> >> >>with
> >> >> > > > >>>>>>exitCode:
> >> >> > > > >>>>>> -1000 due to: File
> >> >> > > > >>>>>>
> >> >> > > > >>>>>>
> >> >> > > > >>>>>>
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >>
> >>>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> >>>>>>>>>>pa
> >> >>>>>>>>ck
> >> >> > > > >>>>>>age-
> >> >> > > > >>>>>> 0.7.0-dist.tar.gz
> >> >> > > > >>>>>> does not exist
> >> >> > > > >>>>>>
> >> >> > > > >>>>>> So I copied the folder to each machine in my cluster and
> >> >>got
> >> >> > this
> >> >> > > > >>>>>>error:
> >> >> > > > >>>>>>
> >> >> > > > >>>>>> Application application_1407509228798_0002 failed 2
> >>times
> >> >>due
> >> >> to
> >> >> > > AM
> >> >> > > > >>>>>> Container for appattempt_1407509228798_0002_000002
> >>exited
> >> >>with
> >> >> > > > >>>>>>exitCode:
> >> >> > > > >>>>>> -1000 due to: Resource
> >> >> > > > >>>>>>
> >> >> > > > >>>>>>
> >> >> > > > >>>>>>
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >>
> >>>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
> >>>>>>>>>>pa
> >> >>>>>>>>ck
> >> >> > > > >>>>>>age-
> >> >> > > > >>>>>> 0.7.0-dist.tar.gz
> >> >> > > > >>>>>> changed on src filesystem (expected 1407509168000, was
> >> >> > > 1407509434000
> >> >> > > > >>>>>>
> >> >> > > > >>>>>> What am I missing?
> >> >> > > > >>>>>>
> >> >> > > > >>>>>> p.s.: I followed this
> >> >> > > > >>>>>>
> >> >> > > > >>>>>><
> >> >> > > >
> >> >>https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> >> >> > > > >>>>>>a>
> >> >> > > > >>>>>> tutorial
> >> >> > > > >>>>>> and this
> >> >> > > > >>>>>> <
> >> >> > > > >>>>>>
> >> >> > > > >>>>>>
> >> >> > > > >>>>>>
> >> >> > > >
> >> >>
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> >> >> > > > >>>>>>node
> >> >> > > > >>>>>> -yarn.html
> >> >> > > > >>>>>>>
> >> >> > > > >>>>>> to
> >> >> > > > >>>>>> set up the cluster.
> >> >> > > > >>>>>>
> >> >> > > > >>>>>> Help is much appreciated.
> >> >> > > > >>>>>>
> >> >> > > > >>>>>> Thanks in advance.
> >> >> > > > >>>>>>
> >> >> > > > >>>>>> --
> >> >> > > > >>>>>> ------------------------------------------
> >> >> > > > >>>>>> Telles Mota Vidal Nobrega
> >> >> > > > >>>>>> M.sc. Candidate at UFCG
> >> >> > > > >>>>>> B.sc. in Computer Science at UFCG
> >> >> > > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >> >> > > > >>>>>>
> >> >> > > > >>>>
> >> >> > > > >>>
> >> >> > > > >>
> >> >> > > > >
> >> >> > > >
> >> >> > > >
> >> >> > >
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > ------------------------------------------
> >> >> > Telles Mota Vidal Nobrega
> >> >> > M.sc. Candidate at UFCG
> >> >> > B.sc. in Computer Science at UFCG
> >> >> > Software Engineer at OpenStack Project - HP/LSD-UFCG
> >> >> >
> >> >>
> >> >
> >> >
> >> >
> >> >--
> >> >------------------------------------------
> >> >Telles Mota Vidal Nobrega
> >> >M.sc. Candidate at UFCG
> >> >B.sc. in Computer Science at UFCG
> >> >Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>
> >>
> >
> >
> >--
> >------------------------------------------
> >Telles Mota Vidal Nobrega
> >M.sc. Candidate at UFCG
> >B.sc. in Computer Science at UFCG
> >Software Engineer at OpenStack Project - HP/LSD-UFCG
>
>
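
A note on the two open questions in the messages quoted above. The Gradle failure is a plain out-of-memory condition: the JVM behind "Gradle Worker 14" tried to commit 3946053632 bytes (roughly 3.7 GB), which a 4 GB VM cannot grant; as Chris points out in the quoted reply, the job package does not need to be compiled on the worker machines at all, so that build can simply be skipped there. For stopping a job from the command line and clearing its Kafka topics between runs, the sketch below shows one common approach; the application id and topic names are placeholders, and topic deletion assumes the brokers run with delete.topic.enable=true (deletion was still maturing in the Kafka 0.8.x line).

# List the running YARN applications and note the id of the Samza job
yarn application -list

# Kill that application (application master and all containers)
yarn application -kill application_1407509228798_0003

# From the Kafka bin directory, delete the job's topics so the next run
# starts clean (topic names here are examples only)
kafka-topics.sh --zookeeper localhost:2181 --delete --topic my-input-topic
kafka-topics.sh --zookeeper localhost:2181 --delete --topic my-output-topic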

Re: Running Job on Multinode Yarn Cluster

Posted by Chris Riccomini <cr...@linkedin.com.INVALID>.
Hey Telles,

Hmm. I'm out of ideas. If Zhijie is around, he'd probably be of use, but I
haven't heard from him in a while.

I'm afraid your best bet is probably to email the YARN dev mailing list,
since this is a YARN config issue.

Cheers,
Chris

On 8/11/14 1:58 PM, "Telles Nobrega" <te...@gmail.com> wrote:

>​I exported ​export
>CLASSPATH=$CLASSPATH:hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar
>and still happened the same problem.
>
>
>On Mon, Aug 11, 2014 at 5:35 PM, Chris Riccomini <
>criccomini@linkedin.com.invalid> wrote:
>
>> Hey Telles,
>>
>> It sounds like either the HDFS jar is missing from the classpath, or the
>> hdfs file system needs to be configured:
>>
>> <property>
>>   <name>fs.hdfs.impl</name>
>>   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
>>   <description>The FileSystem for hdfs: uris.</description>
>> </property>
>>
>>
>> (from
>> 
>>https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptA
>>zE
>> 0)
>>
>> I believe this will need to be configured for your NM.
>>
>> Cheers,
>> Chris
>>
>> On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com> wrote:
>>
>> >Yes, it is like this:
>> >
>> ><configuration>
>> >  <property>
>> >    <name>dfs.datanode.data.dir</name>
>> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
>> >    <description>Comma separated list of paths on the local filesystem
>>of
>> >a
>> >DataNode where it should store its blocks.</description>
>> >  </property>
>> >
>> >  <property>
>> >    <name>dfs.namenode.name.dir</name>
>> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
>> >    <description>Path on the local filesystem where the NameNode stores
>> >the
>> >namespace and transaction logs persistently.</description>
>> >  </property>
>> ></configuration>
>> >~
>> >
>> >I saw some report that this may be a classpath problem. Does this
>>sounds
>> >right to you?
>> >
>> >
>> >On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com> wrote:
>> >
>> >> Hi Telles,
>> >>
>> >> It looks correct. Did you put the hdfs-site.xml into your
>> >>HADOOP_CONF_DIR
>> >> ?(such as ~/.samza/conf)
>> >>
>> >> Fang, Yan
>> >> yanfang724@gmail.com
>> >> +1 (206) 849-4108
>> >>
>> >>
>> >> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
>> >><te...@gmail.com>
>> >> wrote:
>> >>
>> >> > ​Hi Yan Fang,
>> >> >
>> >> > I was able to deploy the file to hdfs, I can see them in all my
>>nodes
>> >>but
>> >> > when I tried running I got this error:
>> >> >
>> >> > Exception in thread "main" java.io.IOException: No FileSystem for
>> >>scheme:
>> >> > hdfs
>> >> > at
>> >> 
>>org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
>> >> >  at
>> >> 
>>org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>> >> > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>> >> >  at
>> >> 
>>org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>> >> > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>> >> >  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>> >> > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>> >> >  at
>> >> >
>> >> >
>> >>
>> 
>>>>org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.s
>>>>ca
>> >>la:111)
>> >> > at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>> >> >  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>> >> > at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>> >> >  at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>> >> > at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>> >> >
>> >> >
>> >> > This is my yarn.package.path config:
>> >> >
>> >> >
>> >> >
>> >> >
>> >>
>> 
>>>>​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0
>>>>.7
>> >>.0-dist.tar.gz
>> >> >
>> >> > Thanks in advance
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <ya...@gmail.com>
>> >>wrote:
>> >> >
>> >> > > Hi Telles,
>> >> > >
>> >> > > In terms of "*I tried pushing the tar file to HDFS but I got an
>> >>error
>> >> > from
>> >> > > hadoop saying that it couldn’t find core-site.xml file*.", I
>>guess
>> >>you
>> >> > set
>> >> > > the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf.
>>You
>> >> can
>> >> > do
>> >> > > 1) make the HADOOP_CONF_DIR point to the directory where your
>>conf
>> >> files
>> >> > > are, such as /etc/hadoop/conf. Or 2) copy the config files to
>> >> > > ~/.samza/conf. Thank you,
>> >> > >
>> >> > > Cheer,
>> >> > >
>> >> > > Fang, Yan
>> >> > > yanfang724@gmail.com
>> >> > > +1 (206) 849-4108
>> >> > >
>> >> > >
>> >> > > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
>> >> > > criccomini@linkedin.com.invalid> wrote:
>> >> > >
>> >> > > > Hey Telles,
>> >> > > >
>> >> > > > To get YARN working with the HTTP file system, you need to
>>follow
>> >>the
>> >> > > > instructions on:
>> >> > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>> >>
>> 
>>http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
>> >>-y
>> >> > > > arn.html
>> >> > > >
>> >> > > >
>> >> > > > In the "Set Up Http Filesystem for YARN" section.
>> >> > > >
>> >> > > > You shouldn't need to compile anything (no Gradle, which is
>>what
>> >>your
>> >> > > > stack trace is showing). This setup should be done for all of
>>the
>> >> NMs,
>> >> > > > since they will be the ones downloading your job's package
>>(from
>> >> > > > yarn.package.path).
>> >> > > >
>> >> > > > Cheers,
>> >> > > > Chris
>> >> > > >
>> >> > > > On 8/9/14 9:44 PM, "Telles Nobrega" <te...@gmail.com>
>> >>wrote:
>> >> > > >
>> >> > > > >Hi again, I tried installing the scala libs but the Http
>>problem
>> >> still
>> >> > > > >occurs. I realised that I need to compile incubator samza in
>>the
>> >> > > machines
>> >> > > > >that I¹m going to run the jobs, but the compilation fails with
>> >>this
>> >> > huge
>> >> > > > >message:
>> >> > > > >
>> >> > > > >#
>> >> > > > ># There is insufficient memory for the Java Runtime
>>Environment
>> >>to
>> >> > > > >continue.
>> >> > > > ># Native memory allocation (malloc) failed to allocate
>>3946053632
>> >> > bytes
>> >> > > > >for committing reserved memory.
>> >> > > > ># An error report file with more information is saved as:
>> >> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
>> >> > > > >Could not write standard input into: Gradle Worker 13.
>> >> > > > >java.io.IOException: Broken pipe
>> >> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
>> >> > > > >       at
>> >>java.io.FileOutputStream.write(FileOutputStream.java:345)
>> >> > > > >       at
>> >> > > >
>> >> 
>>java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>> >> > > > >       at
>> >> > > > 
>>java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOut
>>>>>pu
>> >>>tH
>> >> > > > >andleRunner.java:53)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>rI
>> >>>mp
>> >> > > > >l$1.run(DefaultExecutorFactory.java:66)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>av
>> >>>a:
>> >> > > > >1145)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>ja
>> >>>va
>> >> > > > >:615)
>> >> > > > >       at java.lang.Thread.run(Thread.java:744)
>> >> > > > >Process 'Gradle Worker 13' finished with non-zero exit value 1
>> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle
>>Worker
>> >> 13'
>> >> > > > >finished with non-zero exit value 1
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNor
>>>>>ma
>> >>>lE
>> >> > > > >xitValue(DefaultExecHandle.java:362)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(Default
>>>>>Wo
>> >>>rk
>> >> > > > >erProcess.java:89)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWor
>>>>>ke
>> >>>rP
>> >> > > > >rocess.java:33)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(D
>>>>>ef
>> >>>au
>> >> > > > >ltWorkerProcess.java:55)
>> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>> >> Method)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
>>>>>av
>> >>>a:
>> >> > > > >57)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
>>>>>or
>> >>>Im
>> >> > > > >pl.java:43)
>> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>sp
>> >>>at
>> >> > > > >ch.java:35)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDi
>>>>>sp
>> >>>at
>> >> > > > >ch.java:24)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>81
>> >>>)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:
>>>>>30
>> >>>)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocati
>>>>>on
>> >>>Ha
>> >> > > > >ndler.invoke(ProxyDispatchAdapter.java:93)
>> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown
>> >>Source)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultE
>>>>>xe
>> >>>cH
>> >> > > > >andle.java:212)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHand
>>>>>le
>> >>>.j
>> >> > > > >ava:309)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunne
>>>>>r.
>> >>>ja
>> >> > > > >va:108)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java
>>>>>:8
>> >>>8)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecuto
>>>>>rI
>> >>>mp
>> >> > > > >l$1.run(DefaultExecutorFactory.java:66)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>av
>> >>>a:
>> >> > > > >1145)
>> >> > > > >       at
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>ja
>> >>>va
>> >> > > > >:615)
>> >> > > > >       at java.lang.Thread.run(Thread.java:744)
>> >> > > > >OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed; error='Cannot allocate memory' (errno=12)
>> >> > > > >#
>> >> > > > ># There is insufficient memory for the Java Runtime Environment to continue.
>> >> > > > ># Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
>> >> > > > ># An error report file with more information is saved as:
>> >> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
>> >> > > > >Could not write standard input into: Gradle Worker 14.
>> >> > > > >java.io.IOException: Broken pipe
>> >> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
>> >> > > > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
>> >> > > > >       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>> >> > > > >       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>> >> > > > >       at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
>> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
>> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >> > > > >       at java.lang.Thread.run(Thread.java:744)
>> >> > > > >Process 'Gradle Worker 14' finished with non-zero exit value 1
>> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
>> >> > > > >       at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
>> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
>> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
>> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
>> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> >> > > > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
>> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
>> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
>> >> > > > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
>> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
>> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
>> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
>> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
>> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
>> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >> > > > >       at java.lang.Thread.r
>> >> > > > >
>> >> > > > >Do I need more memory for my machines? Each already has 4GB. I really
>> >> > > > >need to have this running. I'm not sure which way is best, HTTP or
>> >> > > > >HDFS; which one do you suggest, and how can I solve my problem in each
>> >> > > > >case?
>> >> > > > >
>> >> > > > >Thanks in advance, and sorry for bothering this much.
>> >> > > > >On 10 Aug 2014, at 00:20, Telles Nobrega
>> >><te...@gmail.com>
>> >> > > wrote:
>> >> > > > >
>> >> > > > >> Hi Chris, now I have the tar file in my RM machine, and the
>> >>yarn
>> >> > path
>> >> > > > >>points to it. I changed the core-site.xml to use
>>HttpFileSystem
>> >> > instead
>> >> > > > >>of HDFS now it is failing with
>> >> > > > >>
>> >> > > > >> Application application_1407640485281_0001 failed 2 times
>>due
>> >>to
>> >> AM
>> >> > > > >>Container for appattempt_1407640485281_0001_000002 exited
>>with
>> >> > > > >>exitCode:-1000 due to: java.lang.ClassNotFoundException:
>>Class
>> >> > > > >>org.apache.samza.util.hadoop.HttpFileSystem not found
>> >> > > > >>
>> >> > > > >> I think I can solve this just installing scala files from
>>the
>> >> samza
>> >> > > > >>tutorial, can you confirm that?
>> >> > > > >>
>> >> > > > >> On 09 Aug 2014, at 08:34, Telles Nobrega
>> >><tellesnobrega@gmail.com
>> >> >
>> >> > > > >>wrote:
>> >> > > > >>
>> >> > > > >>> Hi Chris,
>> >> > > > >>>
>> >> > > > >>> I think the problem is that I forgot to update the
>> >> > yarn.job.package.
>> >> > > > >>> I will try again to see if it works now.
>> >> > > > >>>
>> >> > > > >>> I have one more question, how can I stop (command line) the
>> >>jobs
>> >> > > > >>>running in my topology, for the experiment that I will run,
>>I
>> >>need
>> >> > to
>> >> > > > >>>run the same job in 4 minutes intervals. So I need to kill
>>it,
>> >> clean
>> >> > > > >>>the kafka topics and rerun.
>> >> > > > >>>
>> >> > > > >>> Thanks in advance.
>> >> > > > >>>
>> >> > > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
>> >> > > > >>><cr...@linkedin.com.INVALID> wrote:
>> >> > > > >>>
>> >> > > > >>>> Hey Telles,
>> >> > > > >>>>
>> >> > > > >>>>>> Do I need to have the job folder on each machine in my
>> >> cluster?
>> >> > > > >>>>
>> >> > > > >>>> No, you should not need to do this. There are two ways to
>> >>deploy
>> >> > > your
>> >> > > > >>>> tarball to the YARN grid. One is to put it in HDFS, and
>>the
>> >> other
>> >> > is
>> >> > > > >>>>to
>> >> > > > >>>> put it on an HTTP server. The link to running a Samza job
>>in
>> >>a
>> >> > > > >>>>multi-node
>> >> > > > >>>> YARN cluster describes how to do both (either HTTP server
>>or
>> >> > HDFS).
>> >> > > > >>>>
>> >> > > > >>>> In both cases, once the tarball is put in on the HTTP/HDFS
>> >> > > server(s),
>> >> > > > >>>>you
>> >> > > > >>>> must update yarn.package.path to point to it. From there,
>>the
>> >> YARN
>> >> > > NM
>> >> > > > >>>> should download it for you automatically when you start
>>your
>> >> job.
>> >> > > > >>>>
>> >> > > > >>>> * Can you send along a paste of your job config?
>> >> > > > >>>>
>> >> > > > >>>> Cheers,
>> >> > > > >>>> Chris
>> >> > > > >>>>
>> >> > > > >>>> On 8/8/14 8:04 AM, "Claudio Martins"
>> >><cl...@mobileaware.com>
>> >> > > wrote:
>> >> > > > >>>>
>> >> > > > >>>>> Hi Telles, it looks to me that you forgot to update the
>> >> > > > >>>>> "yarn.package.path"
>> >> > > > >>>>> attribute in your config file for the task.
>> >> > > > >>>>>
>> >> > > > >>>>> - Claudio Martins
>> >> > > > >>>>> Head of Engineering
>> >> > > > >>>>> MobileAware USA Inc. / www.mobileaware.com
>> >> > > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
>> >> > > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
>> >> > > > >>>>>
>> >> > > > >>>>>
>> >> > > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
>> >> > > > >>>>><te...@gmail.com>
>> >> > > > >>>>> wrote:
>> >> > > > >>>>>
>> >> > > > >>>>>> Hi,
>> >> > > > >>>>>>
>> >> > > > >>>>>> this is my first time trying to run a job on a multinode
>> >> > > > >>>>>>environment. I
>> >> > > > >>>>>> have the cluster set up, I can see in the GUI that all
>> >>nodes
>> >> are
>> >> > > > >>>>>> working.
>> >> > > > >>>>>> Do I need to have the job folder on each machine in my
>> >> cluster?
>> >> > > > >>>>>> - The first time I tried running with the job on the
>> >>namenode
>> >> > > > >>>>>>machine
>> >> > > > >>>>>> and
>> >> > > > >>>>>> it failed saying:
>> >> > > > >>>>>>
>> >> > > > >>>>>> Application application_1407509228798_0001 failed 2
>>times
>> >>due
>> >> to
>> >> > > AM
>> >> > > > >>>>>> Container for appattempt_1407509228798_0001_000002
>>exited
>> >>with
>> >> > > > >>>>>>exitCode:
>> >> > > > >>>>>> -1000 due to: File
>> >> > > > >>>>>>
>> >> > > > >>>>>>
>> >> > > > >>>>>>
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
>>>>>>>>>>pa
>> >>>>>>>>ck
>> >> > > > >>>>>>age-
>> >> > > > >>>>>> 0.7.0-dist.tar.gz
>> >> > > > >>>>>> does not exist
>> >> > > > >>>>>>
>> >> > > > >>>>>> So I copied the folder to each machine in my cluster and
>> >>got
>> >> > this
>> >> > > > >>>>>>error:
>> >> > > > >>>>>>
>> >> > > > >>>>>> Application application_1407509228798_0002 failed 2
>>times
>> >>due
>> >> to
>> >> > > AM
>> >> > > > >>>>>> Container for appattempt_1407509228798_0002_000002
>>exited
>> >>with
>> >> > > > >>>>>>exitCode:
>> >> > > > >>>>>> -1000 due to: Resource
>> >> > > > >>>>>>
>> >> > > > >>>>>>
>> >> > > > >>>>>>
>> >> > > >
>> >> > >
>> >> >
>> >>
>> 
>>>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-
>>>>>>>>>>pa
>> >>>>>>>>ck
>> >> > > > >>>>>>age-
>> >> > > > >>>>>> 0.7.0-dist.tar.gz
>> >> > > > >>>>>> changed on src filesystem (expected 1407509168000, was
>> >> > > 1407509434000
>> >> > > > >>>>>>
>> >> > > > >>>>>> What am I missing?
>> >> > > > >>>>>>
>> >> > > > >>>>>> p.s.: I followed this
>> >> > > > >>>>>>
>> >> > > > >>>>>><
>> >> > > >
>> >>https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
>> >> > > > >>>>>>a>
>> >> > > > >>>>>> tutorial
>> >> > > > >>>>>> and this
>> >> > > > >>>>>> <
>> >> > > > >>>>>>
>> >> > > > >>>>>>
>> >> > > > >>>>>>
>> >> > > >
>> >> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
>> >> > > > >>>>>>node
>> >> > > > >>>>>> -yarn.html
>> >> > > > >>>>>>>
>> >> > > > >>>>>> to
>> >> > > > >>>>>> set up the cluster.
>> >> > > > >>>>>>
>> >> > > > >>>>>> Help is much appreciated.
>> >> > > > >>>>>>
>> >> > > > >>>>>> Thanks in advance.
>> >> > > > >>>>>>
>> >> > > > >>>>>> --
>> >> > > > >>>>>> ------------------------------------------
>> >> > > > >>>>>> Telles Mota Vidal Nobrega
>> >> > > > >>>>>> M.sc. Candidate at UFCG
>> >> > > > >>>>>> B.sc. in Computer Science at UFCG
>> >> > > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>> >> > > > >>>>>>
>> >> > > > >>>>
>> >> > > > >>>
>> >> > > > >>
>> >> > > > >
>> >> > > >
>> >> > > >
>> >> > >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > ------------------------------------------
>> >> > Telles Mota Vidal Nobrega
>> >> > M.sc. Candidate at UFCG
>> >> > B.sc. in Computer Science at UFCG
>> >> > Software Engineer at OpenStack Project - HP/LSD-UFCG
>> >> >
>> >>
>> >
>> >
>> >
>> >--
>> >------------------------------------------
>> >Telles Mota Vidal Nobrega
>> >M.sc. Candidate at UFCG
>> >B.sc. in Computer Science at UFCG
>> >Software Engineer at OpenStack Project - HP/LSD-UFCG
>>
>>
>
>
>-- 
>------------------------------------------
>Telles Mota Vidal Nobrega
>M.sc. Candidate at UFCG
>B.sc. in Computer Science at UFCG
>Software Engineer at OpenStack Project - HP/LSD-UFCG
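
On the java.lang.ClassNotFoundException for org.apache.samza.util.hadoop.HttpFileSystem that comes up in the quoted messages: when the job package is served over HTTP instead of HDFS, each NodeManager needs the http:// scheme mapped to Samza's filesystem class and a jar containing that class on its classpath. The snippet below is only an outline of what that setup looks like; the exact property name and jar locations should be checked against the 0.7.0 multi-node tutorial linked earlier rather than taken as-is.

<property>
  <name>fs.http.impl</name>
  <value>org.apache.samza.util.hadoop.HttpFileSystem</value>
</property>

# Confirm that a jar visible to the NodeManager actually contains the class
# (search path is an example only):
for j in $(find /home/ubuntu -name 'samza-yarn*.jar'); do
  unzip -l "$j" | grep -q 'HttpFileSystem' && echo "$j"
done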


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
​I exported ​export
CLASSPATH=$CLASSPATH:hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar
and the same problem still happened.
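
A few things may be worth double-checking at this point; treat the commands below as a sketch, since the host name, ports and paths are only examples taken from earlier messages. The exported jar path above is relative, so it only resolves when the shell is started from the directory that contains hadoop-2.3.0, and the JobRunner started by the run-job.sh script (assuming the standard 0.7.0 packaging) also needs the cluster configuration visible, which is what the HADOOP_CONF_DIR and fs.hdfs.impl suggestions quoted below are about. Separately, the hdfs:// URI used for yarn.package.path normally carries the NameNode RPC address from fs.defaultFS (commonly port 8020 or 9000); 50070 is usually the NameNode web UI port.

# Use absolute paths and point HADOOP_CONF_DIR at the real config directory
export HADOOP_CONF_DIR=/home/ubuntu/hadoop-2.3.0/etc/hadoop
export CLASSPATH=$CLASSPATH:/home/ubuntu/hadoop-2.3.0/share/hadoop/hdfs/hadoop-hdfs-2.3.0.jar

# Upload the job package to HDFS and confirm it is readable
/home/ubuntu/hadoop-2.3.0/bin/hadoop fs -put \
    /home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz /
/home/ubuntu/hadoop-2.3.0/bin/hadoop fs -ls /

# Then reference it with the fs.defaultFS authority, for example:
# yarn.package.path=hdfs://telles-master-samza:8020/samza-job-package-0.7.0-dist.tar.gz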


On Mon, Aug 11, 2014 at 5:35 PM, Chris Riccomini <
criccomini@linkedin.com.invalid> wrote:

> Hey Telles,
>
> It sounds like either the HDFS jar is missing from the classpath, or the
> hdfs file system needs to be configured:
>
> <property>
>   <name>fs.hdfs.impl</name>
>   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
>   <description>The FileSystem for hdfs: uris.</description>
> </property>
>
>
> (from
> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptAzE
> 0)
>
> I believe this will need to be configured for your NM.
>
> Cheers,
> Chris
>
> On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com> wrote:
>
> >Yes, it is like this:
> >
> ><configuration>
> >  <property>
> >    <name>dfs.datanode.data.dir</name>
> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
> >    <description>Comma separated list of paths on the local filesystem of
> >a
> >DataNode where it should store its blocks.</description>
> >  </property>
> >
> >  <property>
> >    <name>dfs.namenode.name.dir</name>
> >    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
> >    <description>Path on the local filesystem where the NameNode stores
> >the
> >namespace and transaction logs persistently.</description>
> >  </property>
> ></configuration>
> >~
> >
> >I saw some report that this may be a classpath problem. Does this sounds
> >right to you?
> >
> >
> >On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com> wrote:
> >
> >> Hi Telles,
> >>
> >> It looks correct. Did you put the hdfs-site.xml into your
> >>HADOOP_CONF_DIR
> >> ?(such as ~/.samza/conf)
> >>
> >> Fang, Yan
> >> yanfang724@gmail.com
> >> +1 (206) 849-4108
> >>
> >>
> >> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
> >><te...@gmail.com>
> >> wrote:
> >>
> >> > ​Hi Yan Fang,
> >> >
> >> > I was able to deploy the file to hdfs, I can see them in all my nodes
> >>but
> >> > when I tried running I got this error:
> >> >
> >> > Exception in thread "main" java.io.IOException: No FileSystem for
> >>scheme:
> >> > hdfs
> >> > at
> >> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
> >> >  at
> >> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> >> > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >> >  at
> >> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> >> > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >> >  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> >> > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >> >  at
> >> >
> >> >
> >>
> >>org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.sca
> >>la:111)
> >> > at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >> >  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >> > at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >> >  at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> >> > at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >> >
> >> >
> >> > This is my yarn.package.path config:
> >> >
> >> >
> >> >
> >> >
> >>
> >>​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7
> >>.0-dist.tar.gz
> >> >
> >> > Thanks in advance
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <ya...@gmail.com>
> >>wrote:
> >> >
> >> > > Hi Telles,
> >> > >
> >> > > In terms of "*I tried pushing the tar file to HDFS but I got an
> >>error
> >> > from
> >> > > hadoop saying that it couldn’t find core-site.xml file*.", I guess
> >>you
> >> > set
> >> > > the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf. You
> >> can
> >> > do
> >> > > 1) make the HADOOP_CONF_DIR point to the directory where your conf
> >> files
> >> > > are, such as /etc/hadoop/conf. Or 2) copy the config files to
> >> > > ~/.samza/conf. Thank you,
> >> > >
> >> > > Cheer,
> >> > >
> >> > > Fang, Yan
> >> > > yanfang724@gmail.com
> >> > > +1 (206) 849-4108
> >> > >
> >> > >
> >> > > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> >> > > criccomini@linkedin.com.invalid> wrote:
> >> > >
> >> > > > Hey Telles,
> >> > > >
> >> > > > To get YARN working with the HTTP file system, you need to follow
> >>the
> >> > > > instructions on:
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> >>
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
> >>-y
> >> > > > arn.html
> >> > > >
> >> > > >
> >> > > > In the "Set Up Http Filesystem for YARN" section.
> >> > > >
> >> > > > You shouldn't need to compile anything (no Gradle, which is what
> >>your
> >> > > > stack trace is showing). This setup should be done for all of the
> >> NMs,
> >> > > > since they will be the ones downloading your job's package (from
> >> > > > yarn.package.path).
> >> > > >
> >> > > > Cheers,
> >> > > > Chris
> >> > > >
> >> > > > On 8/9/14 9:44 PM, "Telles Nobrega" <te...@gmail.com>
> >>wrote:
> >> > > >
> >> > > > >Hi again, I tried installing the scala libs but the Http problem still
> >> > > > >occurs. I realised that I need to compile incubator samza in the machines
> >> > > > >that I'm going to run the jobs, but the compilation fails with this huge
> >> > > > >message:
> >> > > > >
> >> > > > >#
> >> > > > ># There is insufficient memory for the Java Runtime Environment to continue.
> >> > > > ># Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> >> > > > ># An error report file with more information is saved as:
> >> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> >> > > > >Could not write standard input into: Gradle Worker 13.
> >> > > > >java.io.IOException: Broken pipe
> >> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> >> > > > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> >> > > > >       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >> > > > >       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >> > > > >       at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> >> > > > >Process 'Gradle Worker 13' finished with non-zero exit value 1
> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 13' finished with non-zero exit value 1
> >> > > > >       at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >> > > > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> >> > > > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> >> > > > >OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed; error='Cannot allocate memory' (errno=12)
> >> > > > >#
> >> > > > ># There is insufficient memory for the Java Runtime Environment to continue.
> >> > > > ># Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> >> > > > ># An error report file with more information is saved as:
> >> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> >> > > > >Could not write standard input into: Gradle Worker 14.
> >> > > > >java.io.IOException: Broken pipe
> >> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> >> > > > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> >> > > > >       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >> > > > >       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >> > > > >       at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> > > > >       at java.lang.Thread.run(Thread.java:744)
> >> > > > >Process 'Gradle Worker 14' finished with non-zero exit value 1
> >> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
> >> > > > >       at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> >> > > > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >> > > > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> >> > > > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> >> > > > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> >> > > > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> >> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> >> > > > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> >> > > > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> >> > > > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> >> > > > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >> > > > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >> > > > >       at java.lang.Thread.r
> >> > > > >
> >> > > > >Do I need more memory for my machines? Each already has 4GB. I really
> >> > > > >need to have this running. I'm not sure which way is best, HTTP or
> >> > > > >HDFS; which one do you suggest, and how can I solve my problem in each
> >> > > > >case?
> >> > > > >
> >> > > > >Thanks in advance, and sorry for bothering this much.
> >> > > > >On 10 Aug 2014, at 00:20, Telles Nobrega
> >><te...@gmail.com>
> >> > > wrote:
> >> > > > >
> >> > > > >> Hi Chris, now I have the tar file in my RM machine, and the
> >>yarn
> >> > path
> >> > > > >>points to it. I changed the core-site.xml to use HttpFileSystem
> >> > instead
> >> > > > >>of HDFS now it is failing with
> >> > > > >>
> >> > > > >> Application application_1407640485281_0001 failed 2 times due
> >>to
> >> AM
> >> > > > >>Container for appattempt_1407640485281_0001_000002 exited with
> >> > > > >>exitCode:-1000 due to: java.lang.ClassNotFoundException: Class
> >> > > > >>org.apache.samza.util.hadoop.HttpFileSystem not found
> >> > > > >>
> >> > > > >> I think I can solve this just installing scala files from the
> >> samza
> >> > > > >>tutorial, can you confirm that?
> >> > > > >>
> >> > > > >> On 09 Aug 2014, at 08:34, Telles Nobrega
> >><tellesnobrega@gmail.com
> >> >
> >> > > > >>wrote:
> >> > > > >>
> >> > > > >>> Hi Chris,
> >> > > > >>>
> >> > > > >>> I think the problem is that I forgot to update the
> >> > yarn.job.package.
> >> > > > >>> I will try again to see if it works now.
> >> > > > >>>
> >> > > > >>> I have one more question, how can I stop (command line) the
> >>jobs
> >> > > > >>>running in my topology, for the experiment that I will run, I
> >>need
> >> > to
> >> > > > >>>run the same job in 4 minutes intervals. So I need to kill it,
> >> clean
> >> > > > >>>the kafka topics and rerun.
> >> > > > >>>
> >> > > > >>> Thanks in advance.
> >> > > > >>>
> >> > > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> >> > > > >>><cr...@linkedin.com.INVALID> wrote:
> >> > > > >>>
> >> > > > >>>> Hey Telles,
> >> > > > >>>>
> >> > > > >>>>>> Do I need to have the job folder on each machine in my
> >> cluster?
> >> > > > >>>>
> >> > > > >>>> No, you should not need to do this. There are two ways to
> >>deploy
> >> > > your
> >> > > > >>>> tarball to the YARN grid. One is to put it in HDFS, and the
> >> other
> >> > is
> >> > > > >>>>to
> >> > > > >>>> put it on an HTTP server. The link to running a Samza job in
> >>a
> >> > > > >>>>multi-node
> >> > > > >>>> YARN cluster describes how to do both (either HTTP server or
> >> > HDFS).
> >> > > > >>>>
> >> > > > >>>> In both cases, once the tarball is put in on the HTTP/HDFS
> >> > > server(s),
> >> > > > >>>>you
> >> > > > >>>> must update yarn.package.path to point to it. From there, the
> >> YARN
> >> > > NM
> >> > > > >>>> should download it for you automatically when you start your
> >> job.
> >> > > > >>>>
> >> > > > >>>> * Can you send along a paste of your job config?
> >> > > > >>>>
> >> > > > >>>> Cheers,
> >> > > > >>>> Chris
> >> > > > >>>>
> >> > > > >>>> On 8/8/14 8:04 AM, "Claudio Martins"
> >><cl...@mobileaware.com>
> >> > > wrote:
> >> > > > >>>>
> >> > > > >>>>> Hi Telles, it looks to me that you forgot to update the
> >> > > > >>>>> "yarn.package.path"
> >> > > > >>>>> attribute in your config file for the task.
> >> > > > >>>>>
> >> > > > >>>>> - Claudio Martins
> >> > > > >>>>> Head of Engineering
> >> > > > >>>>> MobileAware USA Inc. / www.mobileaware.com
> >> > > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> >> > > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> >> > > > >>>>>
> >> > > > >>>>>
> >> > > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> >> > > > >>>>><te...@gmail.com>
> >> > > > >>>>> wrote:
> >> > > > >>>>>
> >> > > > >>>>>> Hi,
> >> > > > >>>>>>
> >> > > > >>>>>> this is my first time trying to run a job on a multinode
> >> > > > >>>>>>environment. I
> >> > > > >>>>>> have the cluster set up, I can see in the GUI that all
> >>nodes
> >> are
> >> > > > >>>>>> working.
> >> > > > >>>>>> Do I need to have the job folder on each machine in my
> >> cluster?
> >> > > > >>>>>> - The first time I tried running with the job on the
> >>namenode
> >> > > > >>>>>>machine
> >> > > > >>>>>> and
> >> > > > >>>>>> it failed saying:
> >> > > > >>>>>>
> >> > > > >>>>>> Application application_1407509228798_0001 failed 2 times
> >>due
> >> to
> >> > > AM
> >> > > > >>>>>> Container for appattempt_1407509228798_0001_000002 exited
> >>with
> >> > > > >>>>>>exitCode:
> >> > > > >>>>>> -1000 due to: File
> >> > > > >>>>>>
> >> > > > >>>>>>
> >> > > > >>>>>>
> >> > > >
> >> > >
> >> >
> >>
> >>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pa
> >>>>>>>>ck
> >> > > > >>>>>>age-
> >> > > > >>>>>> 0.7.0-dist.tar.gz
> >> > > > >>>>>> does not exist
> >> > > > >>>>>>
> >> > > > >>>>>> So I copied the folder to each machine in my cluster and
> >>got
> >> > this
> >> > > > >>>>>>error:
> >> > > > >>>>>>
> >> > > > >>>>>> Application application_1407509228798_0002 failed 2 times
> >>due
> >> to
> >> > > AM
> >> > > > >>>>>> Container for appattempt_1407509228798_0002_000002 exited
> >>with
> >> > > > >>>>>>exitCode:
> >> > > > >>>>>> -1000 due to: Resource
> >> > > > >>>>>>
> >> > > > >>>>>>
> >> > > > >>>>>>
> >> > > >
> >> > >
> >> >
> >>
> >>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pa
> >>>>>>>>ck
> >> > > > >>>>>>age-
> >> > > > >>>>>> 0.7.0-dist.tar.gz
> >> > > > >>>>>> changed on src filesystem (expected 1407509168000, was
> >> > > 1407509434000
> >> > > > >>>>>>
> >> > > > >>>>>> What am I missing?
> >> > > > >>>>>>
> >> > > > >>>>>> p.s.: I followed this
> >> > > > >>>>>>
> >> > > > >>>>>><
> >> > > >
> >>https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> >> > > > >>>>>>a>
> >> > > > >>>>>> tutorial
> >> > > > >>>>>> and this
> >> > > > >>>>>> <
> >> > > > >>>>>>
> >> > > > >>>>>>
> >> > > > >>>>>>
> >> > > >
> >> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> >> > > > >>>>>>node
> >> > > > >>>>>> -yarn.html
> >> > > > >>>>>>>
> >> > > > >>>>>> to
> >> > > > >>>>>> set up the cluster.
> >> > > > >>>>>>
> >> > > > >>>>>> Help is much appreciated.
> >> > > > >>>>>>
> >> > > > >>>>>> Thanks in advance.
> >> > > > >>>>>>
> >> > > > >>>>>> --
> >> > > > >>>>>> ------------------------------------------
> >> > > > >>>>>> Telles Mota Vidal Nobrega
> >> > > > >>>>>> M.sc. Candidate at UFCG
> >> > > > >>>>>> B.sc. in Computer Science at UFCG
> >> > > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >> > > > >>>>>>
> >> > > > >>>>
> >> > > > >>>
> >> > > > >>
> >> > > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > ------------------------------------------
> >> > Telles Mota Vidal Nobrega
> >> > M.sc. Candidate at UFCG
> >> > B.sc. in Computer Science at UFCG
> >> > Software Engineer at OpenStack Project - HP/LSD-UFCG
> >> >
> >>
> >
> >
> >
> >--
> >------------------------------------------
> >Telles Mota Vidal Nobrega
> >M.sc. Candidate at UFCG
> >B.sc. in Computer Science at UFCG
> >Software Engineer at OpenStack Project - HP/LSD-UFCG
>
>


-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Chris Riccomini <cr...@linkedin.com.INVALID>.
Hey Telles,

It sounds like either the HDFS jar is missing from the classpath, or the
hdfs file system needs to be configured:

<property>
  <name>fs.hdfs.impl</name>
  <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
  <description>The FileSystem for hdfs: uris.</description>
</property>


(from https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/lyho8ptAzE0)

I believe this will need to be configured for your NM.
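
As a quick sanity check (a sketch only, assuming the hadoop CLI is on the
PATH of the NM host and a stock Hadoop 2.x tarball layout under
$HADOOP_HOME), you can confirm whether the HDFS implementation is actually
visible there:

  # does the resolved Hadoop classpath contain the hdfs jar?
  hadoop classpath | tr ':' '\n' | grep hadoop-hdfs

  # in a tarball install the jar normally sits here
  ls $HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-*.jar

If the jar is present, adding the fs.hdfs.impl property above to the
core-site.xml that the NM reads is usually all that is needed.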

Cheers,
Chris

On 8/11/14 1:31 PM, "Telles Nobrega" <te...@gmail.com> wrote:

>Yes, it is like this:
>
><configuration>
>  <property>
>    <name>dfs.datanode.data.dir</name>
>    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
>    <description>Comma separated list of paths on the local filesystem of
>a
>DataNode where it should store its blocks.</description>
>  </property>
>
>  <property>
>    <name>dfs.namenode.name.dir</name>
>    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
>    <description>Path on the local filesystem where the NameNode stores
>the
>namespace and transaction logs persistently.</description>
>  </property>
></configuration>
>
>I saw some reports that this may be a classpath problem. Does this sound
>right to you?
>
>
>On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com> wrote:
>
>> Hi Telles,
>>
>> It looks correct. Did you put the hdfs-site.xml into your
>>HADOOP_CONF_DIR
>> ?(such as ~/.samza/conf)
>>
>> Fang, Yan
>> yanfang724@gmail.com
>> +1 (206) 849-4108
>>
>>
>> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega
>><te...@gmail.com>
>> wrote:
>>
>> > ​Hi Yan Fang,
>> >
>> > I was able to deploy the file to hdfs, I can see them in all my nodes
>>but
>> > when I tried running I got this error:
>> >
>> > Exception in thread "main" java.io.IOException: No FileSystem for
>>scheme:
>> > hdfs
>> > at
>> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
>> >  at
>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>> > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>> >  at
>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>> > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>> >  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>> > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>> >  at
>> >
>> >
>> 
>>org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.sca
>>la:111)
>> > at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>> >  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>> > at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>> >  at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
>> > at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>> >
>> >
>> > This is my yarn.package.path config:
>> >
>> >
>> >
>> >
>>  
>>​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7
>>.0-dist.tar.gz
>> >
>> > Thanks in advance
>> >
>> >
>> >
>> >
>> >
>> > On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <ya...@gmail.com>
>>wrote:
>> >
>> > > Hi Telles,
>> > >
>> > > In terms of "*I tried pushing the tar file to HDFS but I got an
>>error
>> > from
>> > > hadoop saying that it couldn’t find core-site.xml file*.", I guess
>>you
>> > set
>> > > the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf. You
>> can
>> > do
>> > > 1) make the HADOOP_CONF_DIR point to the directory where your conf
>> files
>> > > are, such as /etc/hadoop/conf. Or 2) copy the config files to
>> > > ~/.samza/conf. Thank you,
>> > >
>> > > Cheer,
>> > >
>> > > Fang, Yan
>> > > yanfang724@gmail.com
>> > > +1 (206) 849-4108
>> > >
>> > >
>> > > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
>> > > criccomini@linkedin.com.invalid> wrote:
>> > >
>> > > > Hey Telles,
>> > > >
>> > > > To get YARN working with the HTTP file system, you need to follow
>>the
>> > > > instructions on:
>> > > >
>> > > >
>> > >
>> >
>> 
>>http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
>>-y
>> > > > arn.html
>> > > >
>> > > >
>> > > > In the "Set Up Http Filesystem for YARN" section.
>> > > >
>> > > > You shouldn't need to compile anything (no Gradle, which is what
>>your
>> > > > stack trace is showing). This setup should be done for all of the
>> NMs,
>> > > > since they will be the ones downloading your job's package (from
>> > > > yarn.package.path).
>> > > >
>> > > > Cheers,
>> > > > Chris
>> > > >
>> > > > On 8/9/14 9:44 PM, "Telles Nobrega" <te...@gmail.com>
>>wrote:
>> > > >
>> > > > >Hi again, I tried installing the scala libs but the Http problem
>> still
>> > > > >occurs. I realised that I need to compile incubator samza in the
>> > > machines
>> > > > >that I¹m going to run the jobs, but the compilation fails with
>>this
>> > huge
>> > > > >message:
>> > > > >
>> > > > >#
>> > > > ># There is insufficient memory for the Java Runtime Environment
>>to
>> > > > >continue.
>> > > > ># Native memory allocation (malloc) failed to allocate 3946053632
>> > bytes
>> > > > >for committing reserved memory.
>> > > > ># An error report file with more information is saved as:
>> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
>> > > > >Could not write standard input into: Gradle Worker 13.
>> > > > >java.io.IOException: Broken pipe
>> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
>> > > > >       at
>>java.io.FileOutputStream.write(FileOutputStream.java:345)
>> > > > >       at
>> > > >
>> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>> > > > >       at
>> > > > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutpu
>>>tH
>> > > > >andleRunner.java:53)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorI
>>>mp
>> > > > >l$1.run(DefaultExecutorFactory.java:66)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav
>>>a:
>> > > > >1145)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
>>>va
>> > > > >:615)
>> > > > >       at java.lang.Thread.run(Thread.java:744)
>> > > > >Process 'Gradle Worker 13' finished with non-zero exit value 1
>> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker
>> 13'
>> > > > >finished with non-zero exit value 1
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNorma
>>>lE
>> > > > >xitValue(DefaultExecHandle.java:362)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWo
>>>rk
>> > > > >erProcess.java:89)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorke
>>>rP
>> > > > >rocess.java:33)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Def
>>>au
>> > > > >ltWorkerProcess.java:55)
>> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.jav
>>>a:
>> > > > >57)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessor
>>>Im
>> > > > >pl.java:43)
>> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDisp
>>>at
>> > > > >ch.java:35)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDisp
>>>at
>> > > > >ch.java:24)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81
>>>)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30
>>>)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocation
>>>Ha
>> > > > >ndler.invoke(ProxyDispatchAdapter.java:93)
>> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown
>>Source)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExe
>>>cH
>> > > > >andle.java:212)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle
>>>.j
>> > > > >ava:309)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.
>>>ja
>> > > > >va:108)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:8
>>>8)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorI
>>>mp
>> > > > >l$1.run(DefaultExecutorFactory.java:66)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav
>>>a:
>> > > > >1145)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
>>>va
>> > > > >:615)
>> > > > >       at java.lang.Thread.run(Thread.java:744)
>> > > > >OpenJDK 64-Bit Server VM warning: INFO:
>> > > > >os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed;
>> > > > >error='Cannot allocate memory' (errno=12)
>> > > > >#
>> > > > ># There is insufficient memory for the Java Runtime Environment
>>to
>> > > > >continue.
>> > > > ># Native memory allocation (malloc) failed to allocate 3946053632
>> > bytes
>> > > > >for committing reserved memory.
>> > > > ># An error report file with more information is saved as:
>> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
>> > > > >Could not write standard input into: Gradle Worker 14.
>> > > > >java.io.IOException: Broken pipe
>> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
>> > > > >       at
>>java.io.FileOutputStream.write(FileOutputStream.java:345)
>> > > > >       at
>> > > >
>> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>> > > > >       at
>> > > > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutpu
>>>tH
>> > > > >andleRunner.java:53)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorI
>>>mp
>> > > > >l$1.run(DefaultExecutorFactory.java:66)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav
>>>a:
>> > > > >1145)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
>>>va
>> > > > >:615)
>> > > > >       at java.lang.Thread.run(Thread.java:744)
>> > > > >Process 'Gradle Worker 14' finished with non-zero exit value 1
>> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker
>> 14'
>> > > > >finished with non-zero exit value 1
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNorma
>>>lE
>> > > > >xitValue(DefaultExecHandle.java:362)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWo
>>>rk
>> > > > >erProcess.java:89)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorke
>>>rP
>> > > > >rocess.java:33)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Def
>>>au
>> > > > >ltWorkerProcess.java:55)
>> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.jav
>>>a:
>> > > > >57)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessor
>>>Im
>> > > > >pl.java:43)
>> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDisp
>>>at
>> > > > >ch.java:35)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDisp
>>>at
>> > > > >ch.java:24)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81
>>>)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30
>>>)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocation
>>>Ha
>> > > > >ndler.invoke(ProxyDispatchAdapter.java:93)
>> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown
>>Source)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExe
>>>cH
>> > > > >andle.java:212)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle
>>>.j
>> > > > >ava:309)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.
>>>ja
>> > > > >va:108)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:8
>>>8)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorI
>>>mp
>> > > > >l$1.run(DefaultExecutorFactory.java:66)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.jav
>>>a:
>> > > > >1145)
>> > > > >       at
>> > > >
>> > >
>> >
>> 
>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
>>>va
>> > > > >:615)
>> > > > >       at java.lang.Thread.r
>> > > > >
>> > > > >Do I need more memory for my machines? Each already has 4GB. I
>> really
>> > > > >need to have this running. I¹m not sure which way is best http or
>> hdfs
>> > > > >which one you suggest and how can i solve my problem for each
>>case.
>> > > > >
>> > > > >Thanks in advance and sorry for bothering this much.
>> > > > >On 10 Aug 2014, at 00:20, Telles Nobrega
>><te...@gmail.com>
>> > > wrote:
>> > > > >
>> > > > >> Hi Chris, now I have the tar file in my RM machine, and the
>>yarn
>> > path
>> > > > >>points to it. I changed the core-site.xml to use HttpFileSystem
>> > instead
>> > > > >>of HDFS now it is failing with
>> > > > >>
>> > > > >> Application application_1407640485281_0001 failed 2 times due
>>to
>> AM
>> > > > >>Container for appattempt_1407640485281_0001_000002 exited with
>> > > > >>exitCode:-1000 due to: java.lang.ClassNotFoundException: Class
>> > > > >>org.apache.samza.util.hadoop.HttpFileSystem not found
>> > > > >>
>> > > > >> I think I can solve this just installing scala files from the
>> samza
>> > > > >>tutorial, can you confirm that?
>> > > > >>
>> > > > >> On 09 Aug 2014, at 08:34, Telles Nobrega
>><tellesnobrega@gmail.com
>> >
>> > > > >>wrote:
>> > > > >>
>> > > > >>> Hi Chris,
>> > > > >>>
>> > > > >>> I think the problem is that I forgot to update the
>> > yarn.job.package.
>> > > > >>> I will try again to see if it works now.
>> > > > >>>
>> > > > >>> I have one more question, how can I stop (command line) the
>>jobs
>> > > > >>>running in my topology, for the experiment that I will run, I
>>need
>> > to
>> > > > >>>run the same job in 4 minutes intervals. So I need to kill it,
>> clean
>> > > > >>>the kafka topics and rerun.
>> > > > >>>
>> > > > >>> Thanks in advance.
>> > > > >>>
>> > > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
>> > > > >>><cr...@linkedin.com.INVALID> wrote:
>> > > > >>>
>> > > > >>>> Hey Telles,
>> > > > >>>>
>> > > > >>>>>> Do I need to have the job folder on each machine in my
>> cluster?
>> > > > >>>>
>> > > > >>>> No, you should not need to do this. There are two ways to
>>deploy
>> > > your
>> > > > >>>> tarball to the YARN grid. One is to put it in HDFS, and the
>> other
>> > is
>> > > > >>>>to
>> > > > >>>> put it on an HTTP server. The link to running a Samza job in
>>a
>> > > > >>>>multi-node
>> > > > >>>> YARN cluster describes how to do both (either HTTP server or
>> > HDFS).
>> > > > >>>>
>> > > > >>>> In both cases, once the tarball is put in on the HTTP/HDFS
>> > > server(s),
>> > > > >>>>you
>> > > > >>>> must update yarn.package.path to point to it. From there, the
>> YARN
>> > > NM
>> > > > >>>> should download it for you automatically when you start your
>> job.
>> > > > >>>>
>> > > > >>>> * Can you send along a paste of your job config?
>> > > > >>>>
>> > > > >>>> Cheers,
>> > > > >>>> Chris
>> > > > >>>>
>> > > > >>>> On 8/8/14 8:04 AM, "Claudio Martins"
>><cl...@mobileaware.com>
>> > > wrote:
>> > > > >>>>
>> > > > >>>>> Hi Telles, it looks to me that you forgot to update the
>> > > > >>>>> "yarn.package.path"
>> > > > >>>>> attribute in your config file for the task.
>> > > > >>>>>
>> > > > >>>>> - Claudio Martins
>> > > > >>>>> Head of Engineering
>> > > > >>>>> MobileAware USA Inc. / www.mobileaware.com
>> > > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
>> > > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
>> > > > >>>>>
>> > > > >>>>>
>> > > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
>> > > > >>>>><te...@gmail.com>
>> > > > >>>>> wrote:
>> > > > >>>>>
>> > > > >>>>>> Hi,
>> > > > >>>>>>
>> > > > >>>>>> this is my first time trying to run a job on a multinode
>> > > > >>>>>>environment. I
>> > > > >>>>>> have the cluster set up, I can see in the GUI that all
>>nodes
>> are
>> > > > >>>>>> working.
>> > > > >>>>>> Do I need to have the job folder on each machine in my
>> cluster?
>> > > > >>>>>> - The first time I tried running with the job on the
>>namenode
>> > > > >>>>>>machine
>> > > > >>>>>> and
>> > > > >>>>>> it failed saying:
>> > > > >>>>>>
>> > > > >>>>>> Application application_1407509228798_0001 failed 2 times
>>due
>> to
>> > > AM
>> > > > >>>>>> Container for appattempt_1407509228798_0001_000002 exited
>>with
>> > > > >>>>>>exitCode:
>> > > > >>>>>> -1000 due to: File
>> > > > >>>>>>
>> > > > >>>>>>
>> > > > >>>>>>
>> > > >
>> > >
>> >
>> 
>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pa
>>>>>>>>ck
>> > > > >>>>>>age-
>> > > > >>>>>> 0.7.0-dist.tar.gz
>> > > > >>>>>> does not exist
>> > > > >>>>>>
>> > > > >>>>>> So I copied the folder to each machine in my cluster and
>>got
>> > this
>> > > > >>>>>>error:
>> > > > >>>>>>
>> > > > >>>>>> Application application_1407509228798_0002 failed 2 times
>>due
>> to
>> > > AM
>> > > > >>>>>> Container for appattempt_1407509228798_0002_000002 exited
>>with
>> > > > >>>>>>exitCode:
>> > > > >>>>>> -1000 due to: Resource
>> > > > >>>>>>
>> > > > >>>>>>
>> > > > >>>>>>
>> > > >
>> > >
>> >
>> 
>>>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pa
>>>>>>>>ck
>> > > > >>>>>>age-
>> > > > >>>>>> 0.7.0-dist.tar.gz
>> > > > >>>>>> changed on src filesystem (expected 1407509168000, was
>> > > 1407509434000
>> > > > >>>>>>
>> > > > >>>>>> What am I missing?
>> > > > >>>>>>
>> > > > >>>>>> p.s.: I followed this
>> > > > >>>>>>
>> > > > >>>>>><
>> > > > 
>>https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
>> > > > >>>>>>a>
>> > > > >>>>>> tutorial
>> > > > >>>>>> and this
>> > > > >>>>>> <
>> > > > >>>>>>
>> > > > >>>>>>
>> > > > >>>>>>
>> > > >
>> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
>> > > > >>>>>>node
>> > > > >>>>>> -yarn.html
>> > > > >>>>>>>
>> > > > >>>>>> to
>> > > > >>>>>> set up the cluster.
>> > > > >>>>>>
>> > > > >>>>>> Help is much appreciated.
>> > > > >>>>>>
>> > > > >>>>>> Thanks in advance.
>> > > > >>>>>>
>> > > > >>>>>> --
>> > > > >>>>>> ------------------------------------------
>> > > > >>>>>> Telles Mota Vidal Nobrega
>> > > > >>>>>> M.sc. Candidate at UFCG
>> > > > >>>>>> B.sc. in Computer Science at UFCG
>> > > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>> > > > >>>>>>
>> > > > >>>>
>> > > > >>>
>> > > > >>
>> > > > >
>> > > >
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > ------------------------------------------
>> > Telles Mota Vidal Nobrega
>> > M.sc. Candidate at UFCG
>> > B.sc. in Computer Science at UFCG
>> > Software Engineer at OpenStack Project - HP/LSD-UFCG
>> >
>>
>
>
>
>-- 
>------------------------------------------
>Telles Mota Vidal Nobrega
>M.sc. Candidate at UFCG
>B.sc. in Computer Science at UFCG
>Software Engineer at OpenStack Project - HP/LSD-UFCG


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Yes, it is like this:

<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
    <description>Comma separated list of paths on the local filesystem of a
DataNode where it should store its blocks.</description>
  </property>

  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
    <description>Path on the local filesystem where the NameNode stores the
namespace and transaction logs persistently.</description>
  </property>
</configuration>

I saw some reports that this may be a classpath problem. Does this sound
right to you?
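
One quick way to check that (just a sketch; the paths below are the ones
already used in this setup, so adjust them if yours differ):

  # is the hadoop-hdfs jar anywhere under the Hadoop install?
  find /home/ubuntu/hadoop-2.3.0 -name 'hadoop-hdfs-*.jar'

  # and is it packaged inside the job tarball that run-job uses?
  tar -tzf samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz | grep -i hdfs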


On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <ya...@gmail.com> wrote:

> Hi Telles,
>
> It looks correct. Did you put the hdfs-site.xml into your HADOOP_CONF_DIR
> (such as ~/.samza/conf)?
>
> Fang, Yan
> yanfang724@gmail.com
> +1 (206) 849-4108
>
>
> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega <te...@gmail.com>
> wrote:
>
> > ​Hi Yan Fang,
> >
> > I was able to deploy the file to hdfs, I can see them in all my nodes but
> > when I tried running I got this error:
> >
> > Exception in thread "main" java.io.IOException: No FileSystem for scheme:
> > hdfs
> > at
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
> >  at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >  at
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >  at
> >
> >
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> > at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> > at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >  at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> > at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >
> >
> > This is my yarn.package.path config:
> >
> >
> >
> >
>  ​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7.0-dist.tar.gz
> >
> > Thanks in advance
> >
> >
> >
> >
> >
> > On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <ya...@gmail.com> wrote:
> >
> > > Hi Telles,
> > >
> > > In terms of "*I tried pushing the tar file to HDFS but I got an error
> > from
> > > hadoop saying that it couldn’t find core-site.xml file*.", I guess you
> > set
> > > the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf. You
> can
> > do
> > > 1) make the HADOOP_CONF_DIR point to the directory where your conf
> files
> > > are, such as /etc/hadoop/conf. Or 2) copy the config files to
> > > ~/.samza/conf. Thank you,
> > >
> > > Cheer,
> > >
> > > Fang, Yan
> > > yanfang724@gmail.com
> > > +1 (206) 849-4108
> > >
> > >
> > > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> > > criccomini@linkedin.com.invalid> wrote:
> > >
> > > > Hey Telles,
> > > >
> > > > To get YARN working with the HTTP file system, you need to follow the
> > > > instructions on:
> > > >
> > > >
> > >
> >
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-y
> > > > arn.html
> > > >
> > > >
> > > > In the "Set Up Http Filesystem for YARN" section.
> > > >
> > > > You shouldn't need to compile anything (no Gradle, which is what your
> > > > stack trace is showing). This setup should be done for all of the
> NMs,
> > > > since they will be the ones downloading your job's package (from
> > > > yarn.package.path).
> > > >
> > > > Cheers,
> > > > Chris
> > > >
> > > > On 8/9/14 9:44 PM, "Telles Nobrega" <te...@gmail.com> wrote:
> > > >
> > > > >Hi again, I tried installing the scala libs but the Http problem
> still
> > > > >occurs. I realised that I need to compile incubator samza in the
> > > machines
> > > > >that I¹m going to run the jobs, but the compilation fails with this
> > huge
> > > > >message:
> > > > >
> > > > >#
> > > > ># There is insufficient memory for the Java Runtime Environment to
> > > > >continue.
> > > > ># Native memory allocation (malloc) failed to allocate 3946053632
> > bytes
> > > > >for committing reserved memory.
> > > > ># An error report file with more information is saved as:
> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> > > > >Could not write standard input into: Gradle Worker 13.
> > > > >java.io.IOException: Broken pipe
> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > > > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > > >       at
> > > >
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > > >       at
> > > > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
> > > > >andleRunner.java:53)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > > > >l$1.run(DefaultExecutorFactory.java:66)
> > > > >       at
> > > >
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > > > >1145)
> > > > >       at
> > > >
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > > > >:615)
> > > > >       at java.lang.Thread.run(Thread.java:744)
> > > > >Process 'Gradle Worker 13' finished with non-zero exit value 1
> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker
> 13'
> > > > >finished with non-zero exit value 1
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
> > > > >xitValue(DefaultExecHandle.java:362)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
> > > > >erProcess.java:89)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
> > > > >rocess.java:33)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
> > > > >ltWorkerProcess.java:55)
> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> > > > >       at
> > > >
> > >
> >
> >sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
> > > > >57)
> > > > >       at
> > > >
> > >
> >
> >sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
> > > > >pl.java:43)
> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > > > >ch.java:35)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > > > >ch.java:24)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
> > > > >ndler.invoke(ProxyDispatchAdapter.java:93)
> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
> > > > >andle.java:212)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
> > > > >ava:309)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
> > > > >va:108)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > > > >l$1.run(DefaultExecutorFactory.java:66)
> > > > >       at
> > > >
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > > > >1145)
> > > > >       at
> > > >
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > > > >:615)
> > > > >       at java.lang.Thread.run(Thread.java:744)
> > > > >OpenJDK 64-Bit Server VM warning: INFO:
> > > > >os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed;
> > > > >error='Cannot allocate memory' (errno=12)
> > > > >#
> > > > ># There is insufficient memory for the Java Runtime Environment to
> > > > >continue.
> > > > ># Native memory allocation (malloc) failed to allocate 3946053632
> > bytes
> > > > >for committing reserved memory.
> > > > ># An error report file with more information is saved as:
> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> > > > >Could not write standard input into: Gradle Worker 14.
> > > > >java.io.IOException: Broken pipe
> > > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > > > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > > >       at
> > > >
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > > >       at
> > > > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
> > > > >andleRunner.java:53)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > > > >l$1.run(DefaultExecutorFactory.java:66)
> > > > >       at
> > > >
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > > > >1145)
> > > > >       at
> > > >
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > > > >:615)
> > > > >       at java.lang.Thread.run(Thread.java:744)
> > > > >Process 'Gradle Worker 14' finished with non-zero exit value 1
> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker
> 14'
> > > > >finished with non-zero exit value 1
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
> > > > >xitValue(DefaultExecHandle.java:362)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
> > > > >erProcess.java:89)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
> > > > >rocess.java:33)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
> > > > >ltWorkerProcess.java:55)
> > > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> > > > >       at
> > > >
> > >
> >
> >sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
> > > > >57)
> > > > >       at
> > > >
> > >
> >
> >sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
> > > > >pl.java:43)
> > > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > > > >ch.java:35)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > > > >ch.java:24)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
> > > > >ndler.invoke(ProxyDispatchAdapter.java:93)
> > > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
> > > > >andle.java:212)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
> > > > >ava:309)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
> > > > >va:108)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > > >       at
> > > >
> > >
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > > > >l$1.run(DefaultExecutorFactory.java:66)
> > > > >       at
> > > >
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > > > >1145)
> > > > >       at
> > > >
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > > > >:615)
> > > > >       at java.lang.Thread.r
> > > > >
> > > > >Do I need more memory for my machines? Each already has 4GB. I
> really
> > > > >need to have this running. I¹m not sure which way is best http or
> hdfs
> > > > >which one you suggest and how can i solve my problem for each case.
> > > > >
> > > > >Thanks in advance and sorry for bothering this much.
> > > > >On 10 Aug 2014, at 00:20, Telles Nobrega <te...@gmail.com>
> > > wrote:
> > > > >
> > > > >> Hi Chris, now I have the tar file in my RM machine, and the yarn
> > path
> > > > >>points to it. I changed the core-site.xml to use HttpFileSystem
> > instead
> > > > >>of HDFS now it is failing with
> > > > >>
> > > > >> Application application_1407640485281_0001 failed 2 times due to
> AM
> > > > >>Container for appattempt_1407640485281_0001_000002 exited with
> > > > >>exitCode:-1000 due to: java.lang.ClassNotFoundException: Class
> > > > >>org.apache.samza.util.hadoop.HttpFileSystem not found
> > > > >>
> > > > >> I think I can solve this just installing scala files from the
> samza
> > > > >>tutorial, can you confirm that?
> > > > >>
> > > > >> On 09 Aug 2014, at 08:34, Telles Nobrega <tellesnobrega@gmail.com
> >
> > > > >>wrote:
> > > > >>
> > > > >>> Hi Chris,
> > > > >>>
> > > > >>> I think the problem is that I forgot to update the
> > yarn.job.package.
> > > > >>> I will try again to see if it works now.
> > > > >>>
> > > > >>> I have one more question, how can I stop (command line) the jobs
> > > > >>>running in my topology, for the experiment that I will run, I need
> > to
> > > > >>>run the same job in 4 minutes intervals. So I need to kill it,
> clean
> > > > >>>the kafka topics and rerun.
> > > > >>>
> > > > >>> Thanks in advance.
> > > > >>>
> > > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> > > > >>><cr...@linkedin.com.INVALID> wrote:
> > > > >>>
> > > > >>>> Hey Telles,
> > > > >>>>
> > > > >>>>>> Do I need to have the job folder on each machine in my
> cluster?
> > > > >>>>
> > > > >>>> No, you should not need to do this. There are two ways to deploy
> > > your
> > > > >>>> tarball to the YARN grid. One is to put it in HDFS, and the
> other
> > is
> > > > >>>>to
> > > > >>>> put it on an HTTP server. The link to running a Samza job in a
> > > > >>>>multi-node
> > > > >>>> YARN cluster describes how to do both (either HTTP server or
> > HDFS).
> > > > >>>>
> > > > >>>> In both cases, once the tarball is put in on the HTTP/HDFS
> > > server(s),
> > > > >>>>you
> > > > >>>> must update yarn.package.path to point to it. From there, the
> YARN
> > > NM
> > > > >>>> should download it for you automatically when you start your
> job.
> > > > >>>>
> > > > >>>> * Can you send along a paste of your job config?
> > > > >>>>
> > > > >>>> Cheers,
> > > > >>>> Chris
> > > > >>>>
> > > > >>>> On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com>
> > > wrote:
> > > > >>>>
> > > > >>>>> Hi Telles, it looks to me that you forgot to update the
> > > > >>>>> "yarn.package.path"
> > > > >>>>> attribute in your config file for the task.
> > > > >>>>>
> > > > >>>>> - Claudio Martins
> > > > >>>>> Head of Engineering
> > > > >>>>> MobileAware USA Inc. / www.mobileaware.com
> > > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> > > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> > > > >>>>><te...@gmail.com>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> Hi,
> > > > >>>>>>
> > > > >>>>>> this is my first time trying to run a job on a multinode
> > > > >>>>>>environment. I
> > > > >>>>>> have the cluster set up, I can see in the GUI that all nodes
> are
> > > > >>>>>> working.
> > > > >>>>>> Do I need to have the job folder on each machine in my
> cluster?
> > > > >>>>>> - The first time I tried running with the job on the namenode
> > > > >>>>>>machine
> > > > >>>>>> and
> > > > >>>>>> it failed saying:
> > > > >>>>>>
> > > > >>>>>> Application application_1407509228798_0001 failed 2 times due
> to
> > > AM
> > > > >>>>>> Container for appattempt_1407509228798_0001_000002 exited with
> > > > >>>>>>exitCode:
> > > > >>>>>> -1000 due to: File
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > >
> > >
> >
> >>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
> > > > >>>>>>age-
> > > > >>>>>> 0.7.0-dist.tar.gz
> > > > >>>>>> does not exist
> > > > >>>>>>
> > > > >>>>>> So I copied the folder to each machine in my cluster and got
> > this
> > > > >>>>>>error:
> > > > >>>>>>
> > > > >>>>>> Application application_1407509228798_0002 failed 2 times due
> to
> > > AM
> > > > >>>>>> Container for appattempt_1407509228798_0002_000002 exited with
> > > > >>>>>>exitCode:
> > > > >>>>>> -1000 due to: Resource
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > >
> > >
> >
> >>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
> > > > >>>>>>age-
> > > > >>>>>> 0.7.0-dist.tar.gz
> > > > >>>>>> changed on src filesystem (expected 1407509168000, was
> > > 1407509434000
> > > > >>>>>>
> > > > >>>>>> What am I missing?
> > > > >>>>>>
> > > > >>>>>> p.s.: I followed this
> > > > >>>>>>
> > > > >>>>>><
> > > > https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> > > > >>>>>>a>
> > > > >>>>>> tutorial
> > > > >>>>>> and this
> > > > >>>>>> <
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > >
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> > > > >>>>>>node
> > > > >>>>>> -yarn.html
> > > > >>>>>>>
> > > > >>>>>> to
> > > > >>>>>> set up the cluster.
> > > > >>>>>>
> > > > >>>>>> Help is much appreciated.
> > > > >>>>>>
> > > > >>>>>> Thanks in advance.
> > > > >>>>>>
> > > > >>>>>> --
> > > > >>>>>> ------------------------------------------
> > > > >>>>>> Telles Mota Vidal Nobrega
> > > > >>>>>> M.sc. Candidate at UFCG
> > > > >>>>>> B.sc. in Computer Science at UFCG
> > > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> > > > >>>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > ------------------------------------------
> > Telles Mota Vidal Nobrega
> > M.sc. Candidate at UFCG
> > B.sc. in Computer Science at UFCG
> > Software Engineer at OpenStack Project - HP/LSD-UFCG
> >
>



-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Yan Fang <ya...@gmail.com>.
Hi Telles,

It looks correct. Did you put the hdfs-site.xml into your HADOOP_CONF_DIR
(such as ~/.samza/conf)?
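
For example (a sketch only; the Hadoop path is the one from your
hdfs-site.xml, and the exact conf directory may differ on your install):

  mkdir -p ~/.samza/conf
  cp /home/ubuntu/hadoop-2.3.0/etc/hadoop/core-site.xml ~/.samza/conf/
  cp /home/ubuntu/hadoop-2.3.0/etc/hadoop/hdfs-site.xml ~/.samza/conf/
  export HADOOP_CONF_DIR=~/.samza/conf

Or simply point HADOOP_CONF_DIR at the directory that already holds those
files.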

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108


On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega <te...@gmail.com>
wrote:

> ​Hi Yan Fang,
>
> I was able to deploy the file to hdfs, I can see them in all my nodes but
> when I tried running I got this error:
>
> Exception in thread "main" java.io.IOException: No FileSystem for scheme:
> hdfs
> at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
>  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>  at
>
> org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>  at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
>  at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>
>
> This is my yarn.package.path config:
>
>
>
>  ​yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7.0-dist.tar.gz
>
> Thanks in advance
>
>
>
>
>
> On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <ya...@gmail.com> wrote:
>
> > Hi Telles,
> >
> > In terms of "*I tried pushing the tar file to HDFS but I got an error
> from
> > hadoop saying that it couldn’t find core-site.xml file*.", I guess you
> set
> > the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf. You can
> do
> > 1) make the HADOOP_CONF_DIR point to the directory where your conf files
> > are, such as /etc/hadoop/conf. Or 2) copy the config files to
> > ~/.samza/conf. Thank you,
> >
> > Cheer,
> >
> > Fang, Yan
> > yanfang724@gmail.com
> > +1 (206) 849-4108
> >
> >
> > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> > criccomini@linkedin.com.invalid> wrote:
> >
> > > Hey Telles,
> > >
> > > To get YARN working with the HTTP file system, you need to follow the
> > > instructions on:
> > >
> > >
> >
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-y
> > > arn.html
> > >
> > >
> > > In the "Set Up Http Filesystem for YARN" section.
> > >
> > > You shouldn't need to compile anything (no Gradle, which is what your
> > > stack trace is showing). This setup should be done for all of the NMs,
> > > since they will be the ones downloading your job's package (from
> > > yarn.package.path).
> > >
> > > Cheers,
> > > Chris
> > >
> > > On 8/9/14 9:44 PM, "Telles Nobrega" <te...@gmail.com> wrote:
> > >
> > > >Hi again, I tried installing the scala libs but the Http problem still
> > > >occurs. I realised that I need to compile incubator samza in the
> > machines
> > > >that I¹m going to run the jobs, but the compilation fails with this
> huge
> > > >message:
> > > >
> > > >#
> > > ># There is insufficient memory for the Java Runtime Environment to
> > > >continue.
> > > ># Native memory allocation (malloc) failed to allocate 3946053632
> bytes
> > > >for committing reserved memory.
> > > ># An error report file with more information is saved as:
> > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> > > >Could not write standard input into: Gradle Worker 13.
> > > >java.io.IOException: Broken pipe
> > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > >       at
> > > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > >       at
> > > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > >       at
> > >
> >
> >org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
> > > >andleRunner.java:53)
> > > >       at
> > >
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > > >l$1.run(DefaultExecutorFactory.java:66)
> > > >       at
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > > >1145)
> > > >       at
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > > >:615)
> > > >       at java.lang.Thread.run(Thread.java:744)
> > > >Process 'Gradle Worker 13' finished with non-zero exit value 1
> > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 13'
> > > >finished with non-zero exit value 1
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
> > > >xitValue(DefaultExecHandle.java:362)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
> > > >erProcess.java:89)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
> > > >rocess.java:33)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
> > > >ltWorkerProcess.java:55)
> > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >       at
> > >
> >
> >sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
> > > >57)
> > > >       at
> > >
> >
> >sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
> > > >pl.java:43)
> > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >       at
> > >
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > > >ch.java:35)
> > > >       at
> > >
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > > >ch.java:24)
> > > >       at
> > >
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > >       at
> > >
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > >       at
> > >
> >
> >org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
> > > >ndler.invoke(ProxyDispatchAdapter.java:93)
> > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
> > > >andle.java:212)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
> > > >ava:309)
> > > >       at
> > >
> >
> >org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
> > > >va:108)
> > > >       at
> > >
> >
> >org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > >       at
> > >
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > > >l$1.run(DefaultExecutorFactory.java:66)
> > > >       at
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > > >1145)
> > > >       at
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > > >:615)
> > > >       at java.lang.Thread.run(Thread.java:744)
> > > >OpenJDK 64-Bit Server VM warning: INFO:
> > > >os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed;
> > > >error='Cannot allocate memory' (errno=12)
> > > >#
> > > ># There is insufficient memory for the Java Runtime Environment to
> > > >continue.
> > > ># Native memory allocation (malloc) failed to allocate 3946053632
> bytes
> > > >for committing reserved memory.
> > > ># An error report file with more information is saved as:
> > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> > > >Could not write standard input into: Gradle Worker 14.
> > > >java.io.IOException: Broken pipe
> > > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > >       at
> > > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > >       at
> > > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > >       at
> > >
> >
> >org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
> > > >andleRunner.java:53)
> > > >       at
> > >
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > > >l$1.run(DefaultExecutorFactory.java:66)
> > > >       at
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > > >1145)
> > > >       at
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > > >:615)
> > > >       at java.lang.Thread.run(Thread.java:744)
> > > >Process 'Gradle Worker 14' finished with non-zero exit value 1
> > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14'
> > > >finished with non-zero exit value 1
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
> > > >xitValue(DefaultExecHandle.java:362)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
> > > >erProcess.java:89)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
> > > >rocess.java:33)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
> > > >ltWorkerProcess.java:55)
> > > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >       at
> > >
> >
> >sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
> > > >57)
> > > >       at
> > >
> >
> >sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
> > > >pl.java:43)
> > > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > > >       at
> > >
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > > >ch.java:35)
> > > >       at
> > >
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > > >ch.java:24)
> > > >       at
> > >
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > >       at
> > >
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > >       at
> > >
> >
> >org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
> > > >ndler.invoke(ProxyDispatchAdapter.java:93)
> > > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
> > > >andle.java:212)
> > > >       at
> > >
> >
> >org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
> > > >ava:309)
> > > >       at
> > >
> >
> >org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
> > > >va:108)
> > > >       at
> > >
> >
> >org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > >       at
> > >
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > > >l$1.run(DefaultExecutorFactory.java:66)
> > > >       at
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > > >1145)
> > > >       at
> > >
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > > >:615)
> > > >       at java.lang.Thread.r
> > > >
> > > >Do I need more memory for my machines? Each already has 4GB. I really
> > > >need to have this running. I¹m not sure which way is best http or hdfs
> > > >which one you suggest and how can i solve my problem for each case.
> > > >
> > > >Thanks in advance and sorry for bothering this much.
> > > >On 10 Aug 2014, at 00:20, Telles Nobrega <te...@gmail.com>
> > wrote:
> > > >
> > > >> Hi Chris, now I have the tar file in my RM machine, and the yarn
> path
> > > >>points to it. I changed the core-site.xml to use HttpFileSystem
> instead
> > > >>of HDFS now it is failing with
> > > >>
> > > >> Application application_1407640485281_0001 failed 2 times due to AM
> > > >>Container for appattempt_1407640485281_0001_000002 exited with
> > > >>exitCode:-1000 due to: java.lang.ClassNotFoundException: Class
> > > >>org.apache.samza.util.hadoop.HttpFileSystem not found
> > > >>
> > > >> I think I can solve this just installing scala files from the samza
> > > >>tutorial, can you confirm that?
> > > >>
> > > >> On 09 Aug 2014, at 08:34, Telles Nobrega <te...@gmail.com>
> > > >>wrote:
> > > >>
> > > >>> Hi Chris,
> > > >>>
> > > >>> I think the problem is that I forgot to update the
> yarn.job.package.
> > > >>> I will try again to see if it works now.
> > > >>>
> > > >>> I have one more question, how can I stop (command line) the jobs
> > > >>>running in my topology, for the experiment that I will run, I need
> to
> > > >>>run the same job in 4 minutes intervals. So I need to kill it, clean
> > > >>>the kafka topics and rerun.
> > > >>>
> > > >>> Thanks in advance.
> > > >>>
> > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> > > >>><cr...@linkedin.com.INVALID> wrote:
> > > >>>
> > > >>>> Hey Telles,
> > > >>>>
> > > >>>>>> Do I need to have the job folder on each machine in my cluster?
> > > >>>>
> > > >>>> No, you should not need to do this. There are two ways to deploy
> > your
> > > >>>> tarball to the YARN grid. One is to put it in HDFS, and the other
> is
> > > >>>>to
> > > >>>> put it on an HTTP server. The link to running a Samza job in a
> > > >>>>multi-node
> > > >>>> YARN cluster describes how to do both (either HTTP server or
> HDFS).
> > > >>>>
> > > >>>> In both cases, once the tarball is put in on the HTTP/HDFS
> > server(s),
> > > >>>>you
> > > >>>> must update yarn.package.path to point to it. From there, the YARN
> > NM
> > > >>>> should download it for you automatically when you start your job.
> > > >>>>
> > > >>>> * Can you send along a paste of your job config?
> > > >>>>
> > > >>>> Cheers,
> > > >>>> Chris
> > > >>>>
> > > >>>> On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com>
> > wrote:
> > > >>>>
> > > >>>>> Hi Telles, it looks to me that you forgot to update the
> > > >>>>> "yarn.package.path"
> > > >>>>> attribute in your config file for the task.
> > > >>>>>
> > > >>>>> - Claudio Martins
> > > >>>>> Head of Engineering
> > > >>>>> MobileAware USA Inc. / www.mobileaware.com
> > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> > > >>>>><te...@gmail.com>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> this is my first time trying to run a job on a multinode
> > > >>>>>>environment. I
> > > >>>>>> have the cluster set up, I can see in the GUI that all nodes are
> > > >>>>>> working.
> > > >>>>>> Do I need to have the job folder on each machine in my cluster?
> > > >>>>>> - The first time I tried running with the job on the namenode
> > > >>>>>>machine
> > > >>>>>> and
> > > >>>>>> it failed saying:
> > > >>>>>>
> > > >>>>>> Application application_1407509228798_0001 failed 2 times due to
> > AM
> > > >>>>>> Container for appattempt_1407509228798_0001_000002 exited with
> > > >>>>>>exitCode:
> > > >>>>>> -1000 due to: File
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > >
> >
> >>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
> > > >>>>>>age-
> > > >>>>>> 0.7.0-dist.tar.gz
> > > >>>>>> does not exist
> > > >>>>>>
> > > >>>>>> So I copied the folder to each machine in my cluster and got
> this
> > > >>>>>>error:
> > > >>>>>>
> > > >>>>>> Application application_1407509228798_0002 failed 2 times due to
> > AM
> > > >>>>>> Container for appattempt_1407509228798_0002_000002 exited with
> > > >>>>>>exitCode:
> > > >>>>>> -1000 due to: Resource
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > >
> >
> >>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
> > > >>>>>>age-
> > > >>>>>> 0.7.0-dist.tar.gz
> > > >>>>>> changed on src filesystem (expected 1407509168000, was
> > 1407509434000
> > > >>>>>>
> > > >>>>>> What am I missing?
> > > >>>>>>
> > > >>>>>> p.s.: I followed this
> > > >>>>>>
> > > >>>>>><
> > > https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> > > >>>>>>a>
> > > >>>>>> tutorial
> > > >>>>>> and this
> > > >>>>>> <
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> > > >>>>>>node
> > > >>>>>> -yarn.html
> > > >>>>>>>
> > > >>>>>> to
> > > >>>>>> set up the cluster.
> > > >>>>>>
> > > >>>>>> Help is much appreciated.
> > > >>>>>>
> > > >>>>>> Thanks in advance.
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> ------------------------------------------
> > > >>>>>> Telles Mota Vidal Nobrega
> > > >>>>>> M.sc. Candidate at UFCG
> > > >>>>>> B.sc. in Computer Science at UFCG
> > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> > > >>>>>>
> > > >>>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> >
>
>
>
> --
> ------------------------------------------
> Telles Mota Vidal Nobrega
> M.sc. Candidate at UFCG
> B.sc. in Computer Science at UFCG
> Software Engineer at OpenStack Project - HP/LSD-UFCG
>

Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Hi Yan Fang,

I was able to deploy the file to HDFS and I can see it on all my nodes, but
when I tried running I got this error:

Exception in thread "main" java.io.IOException: No FileSystem for scheme:
hdfs
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
 at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
 at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
 at
org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
 at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
 at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
at org.apache.samza.job.JobRunner.main(JobRunner.scala)


This is my yarn.package.path config:


yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7.0-dist.tar.gz

Thanks in advance
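
A note on this one: "No FileSystem for scheme: hdfs" generally means the client
running the run-job script has no implementation registered for hdfs:// URIs. A
minimal sketch of the usual workaround, assuming Hadoop 2.x and that the
hadoop-hdfs jar is on the classpath, is to declare the implementation
explicitly in the core-site.xml that HADOOP_CONF_DIR points at:

  <!-- inside <configuration>; sketch only, not verified on this cluster -->
  <property>
    <name>fs.hdfs.impl</name>
    <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
  </property>

Also worth checking: an hdfs:// URI in yarn.package.path normally uses the
NameNode RPC port from fs.defaultFS (often 8020 or 9000, depending on the
install), not the 50070 web UI port, so the 50070 in the config above is
something to confirm against the cluster's core-site.xml rather than copy
as-is.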





On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <ya...@gmail.com> wrote:

> Hi Telles,
>
> In terms of "*I tried pushing the tar file to HDFS but I got an error from
> hadoop saying that it couldn’t find core-site.xml file*.", I guess you set
> the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf. You can do
> 1) make the HADOOP_CONF_DIR point to the directory where your conf files
> are, such as /etc/hadoop/conf. Or 2) copy the config files to
> ~/.samza/conf. Thank you,
>
> Cheer,
>
> Fang, Yan
> yanfang724@gmail.com
> +1 (206) 849-4108
>
>
> On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> criccomini@linkedin.com.invalid> wrote:
>
> > Hey Telles,
> >
> > To get YARN working with the HTTP file system, you need to follow the
> > instructions on:
> >
> >
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-y
> > arn.html
> >
> >
> > In the "Set Up Http Filesystem for YARN" section.
> >
> > You shouldn't need to compile anything (no Gradle, which is what your
> > stack trace is showing). This setup should be done for all of the NMs,
> > since they will be the ones downloading your job's package (from
> > yarn.package.path).
> >
> > Cheers,
> > Chris
> >
> > On 8/9/14 9:44 PM, "Telles Nobrega" <te...@gmail.com> wrote:
> >
> > >Hi again, I tried installing the scala libs but the Http problem still
> > >occurs. I realised that I need to compile incubator samza in the
> machines
> > >that I’m going to run the jobs, but the compilation fails with this huge
> > >message:
> > >
> > >#
> > ># There is insufficient memory for the Java Runtime Environment to
> > >continue.
> > ># Native memory allocation (malloc) failed to allocate 3946053632 bytes
> > >for committing reserved memory.
> > ># An error report file with more information is saved as:
> > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> > >Could not write standard input into: Gradle Worker 13.
> > >java.io.IOException: Broken pipe
> > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > >       at
> > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > >       at
> > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > >       at
> >
> >org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
> > >andleRunner.java:53)
> > >       at
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > >l$1.run(DefaultExecutorFactory.java:66)
> > >       at
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > >1145)
> > >       at
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > >:615)
> > >       at java.lang.Thread.run(Thread.java:744)
> > >Process 'Gradle Worker 13' finished with non-zero exit value 1
> > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 13'
> > >finished with non-zero exit value 1
> > >       at
> >
> >org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
> > >xitValue(DefaultExecHandle.java:362)
> > >       at
> >
> >org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
> > >erProcess.java:89)
> > >       at
> >
> >org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
> > >rocess.java:33)
> > >       at
> >
> >org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
> > >ltWorkerProcess.java:55)
> > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >       at
> >
> >sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
> > >57)
> > >       at
> >
> >sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
> > >pl.java:43)
> > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > >       at
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > >ch.java:35)
> > >       at
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > >ch.java:24)
> > >       at
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > >       at
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > >       at
> >
> >org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
> > >ndler.invoke(ProxyDispatchAdapter.java:93)
> > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > >       at
> >
> >org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
> > >andle.java:212)
> > >       at
> >
> >org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
> > >ava:309)
> > >       at
> >
> >org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
> > >va:108)
> > >       at
> >
> >org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > >       at
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > >l$1.run(DefaultExecutorFactory.java:66)
> > >       at
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > >1145)
> > >       at
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > >:615)
> > >       at java.lang.Thread.run(Thread.java:744)
> > >OpenJDK 64-Bit Server VM warning: INFO:
> > >os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed;
> > >error='Cannot allocate memory' (errno=12)
> > >#
> > ># There is insufficient memory for the Java Runtime Environment to
> > >continue.
> > ># Native memory allocation (malloc) failed to allocate 3946053632 bytes
> > >for committing reserved memory.
> > ># An error report file with more information is saved as:
> > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> > >Could not write standard input into: Gradle Worker 14.
> > >java.io.IOException: Broken pipe
> > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > >       at
> > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > >       at
> > java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > >       at
> >
> >org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
> > >andleRunner.java:53)
> > >       at
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > >l$1.run(DefaultExecutorFactory.java:66)
> > >       at
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > >1145)
> > >       at
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > >:615)
> > >       at java.lang.Thread.run(Thread.java:744)
> > >Process 'Gradle Worker 14' finished with non-zero exit value 1
> > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14'
> > >finished with non-zero exit value 1
> > >       at
> >
> >org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
> > >xitValue(DefaultExecHandle.java:362)
> > >       at
> >
> >org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
> > >erProcess.java:89)
> > >       at
> >
> >org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
> > >rocess.java:33)
> > >       at
> >
> >org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
> > >ltWorkerProcess.java:55)
> > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >       at
> >
> >sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
> > >57)
> > >       at
> >
> >sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
> > >pl.java:43)
> > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > >       at
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > >ch.java:35)
> > >       at
> >
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> > >ch.java:24)
> > >       at
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > >       at
> >
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > >       at
> >
> >org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
> > >ndler.invoke(ProxyDispatchAdapter.java:93)
> > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > >       at
> >
> >org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
> > >andle.java:212)
> > >       at
> >
> >org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
> > >ava:309)
> > >       at
> >
> >org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
> > >va:108)
> > >       at
> >
> >org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > >       at
> >
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> > >l$1.run(DefaultExecutorFactory.java:66)
> > >       at
> >
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> > >1145)
> > >       at
> >
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> > >:615)
> > >       at java.lang.Thread.r
> > >
> > >Do I need more memory for my machines? Each already has 4GB. I really
> > >need to have this running. I’m not sure which way is best http or hdfs
> > >which one you suggest and how can i solve my problem for each case.
> > >
> > >Thanks in advance and sorry for bothering this much.
> > >On 10 Aug 2014, at 00:20, Telles Nobrega <te...@gmail.com>
> wrote:
> > >
> > >> Hi Chris, now I have the tar file in my RM machine, and the yarn path
> > >>points to it. I changed the core-site.xml to use HttpFileSystem instead
> > >>of HDFS now it is failing with
> > >>
> > >> Application application_1407640485281_0001 failed 2 times due to AM
> > >>Container for appattempt_1407640485281_0001_000002 exited with
> > >>exitCode:-1000 due to: java.lang.ClassNotFoundException: Class
> > >>org.apache.samza.util.hadoop.HttpFileSystem not found
> > >>
> > >> I think I can solve this just installing scala files from the samza
> > >>tutorial, can you confirm that?
> > >>
> > >> On 09 Aug 2014, at 08:34, Telles Nobrega <te...@gmail.com>
> > >>wrote:
> > >>
> > >>> Hi Chris,
> > >>>
> > >>> I think the problem is that I forgot to update the yarn.job.package.
> > >>> I will try again to see if it works now.
> > >>>
> > >>> I have one more question, how can I stop (command line) the jobs
> > >>>running in my topology, for the experiment that I will run, I need to
> > >>>run the same job in 4 minutes intervals. So I need to kill it, clean
> > >>>the kafka topics and rerun.
> > >>>
> > >>> Thanks in advance.
> > >>>
> > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> > >>><cr...@linkedin.com.INVALID> wrote:
> > >>>
> > >>>> Hey Telles,
> > >>>>
> > >>>>>> Do I need to have the job folder on each machine in my cluster?
> > >>>>
> > >>>> No, you should not need to do this. There are two ways to deploy
> your
> > >>>> tarball to the YARN grid. One is to put it in HDFS, and the other is
> > >>>>to
> > >>>> put it on an HTTP server. The link to running a Samza job in a
> > >>>>multi-node
> > >>>> YARN cluster describes how to do both (either HTTP server or HDFS).
> > >>>>
> > >>>> In both cases, once the tarball is put in on the HTTP/HDFS
> server(s),
> > >>>>you
> > >>>> must update yarn.package.path to point to it. From there, the YARN
> NM
> > >>>> should download it for you automatically when you start your job.
> > >>>>
> > >>>> * Can you send along a paste of your job config?
> > >>>>
> > >>>> Cheers,
> > >>>> Chris
> > >>>>
> > >>>> On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com>
> wrote:
> > >>>>
> > >>>>> Hi Telles, it looks to me that you forgot to update the
> > >>>>> "yarn.package.path"
> > >>>>> attribute in your config file for the task.
> > >>>>>
> > >>>>> - Claudio Martins
> > >>>>> Head of Engineering
> > >>>>> MobileAware USA Inc. / www.mobileaware.com
> > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> > >>>>>
> > >>>>>
> > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> > >>>>><te...@gmail.com>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> this is my first time trying to run a job on a multinode
> > >>>>>>environment. I
> > >>>>>> have the cluster set up, I can see in the GUI that all nodes are
> > >>>>>> working.
> > >>>>>> Do I need to have the job folder on each machine in my cluster?
> > >>>>>> - The first time I tried running with the job on the namenode
> > >>>>>>machine
> > >>>>>> and
> > >>>>>> it failed saying:
> > >>>>>>
> > >>>>>> Application application_1407509228798_0001 failed 2 times due to
> AM
> > >>>>>> Container for appattempt_1407509228798_0001_000002 exited with
> > >>>>>>exitCode:
> > >>>>>> -1000 due to: File
> > >>>>>>
> > >>>>>>
> > >>>>>>
> >
> >>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
> > >>>>>>age-
> > >>>>>> 0.7.0-dist.tar.gz
> > >>>>>> does not exist
> > >>>>>>
> > >>>>>> So I copied the folder to each machine in my cluster and got this
> > >>>>>>error:
> > >>>>>>
> > >>>>>> Application application_1407509228798_0002 failed 2 times due to
> AM
> > >>>>>> Container for appattempt_1407509228798_0002_000002 exited with
> > >>>>>>exitCode:
> > >>>>>> -1000 due to: Resource
> > >>>>>>
> > >>>>>>
> > >>>>>>
> >
> >>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
> > >>>>>>age-
> > >>>>>> 0.7.0-dist.tar.gz
> > >>>>>> changed on src filesystem (expected 1407509168000, was
> 1407509434000
> > >>>>>>
> > >>>>>> What am I missing?
> > >>>>>>
> > >>>>>> p.s.: I followed this
> > >>>>>>
> > >>>>>><
> > https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> > >>>>>>a>
> > >>>>>> tutorial
> > >>>>>> and this
> > >>>>>> <
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> > >>>>>>node
> > >>>>>> -yarn.html
> > >>>>>>>
> > >>>>>> to
> > >>>>>> set up the cluster.
> > >>>>>>
> > >>>>>> Help is much appreciated.
> > >>>>>>
> > >>>>>> Thanks in advance.
> > >>>>>>
> > >>>>>> --
> > >>>>>> ------------------------------------------
> > >>>>>> Telles Mota Vidal Nobrega
> > >>>>>> M.sc. Candidate at UFCG
> > >>>>>> B.sc. in Computer Science at UFCG
> > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> > >>>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
> >
>



-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG

Re: Running Job on Multinode Yarn Cluster

Posted by Yan Fang <ya...@gmail.com>.
Hi Telles,

In terms of "*I tried pushing the tar file to HDFS but I got an error from
hadoop saying that it couldn’t find core-site.xml file*.", I guess you set
the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf. You can
either 1) make HADOOP_CONF_DIR point to the directory where your conf files
actually are, such as /etc/hadoop/conf, or 2) copy the config files into
~/.samza/conf. Thank you,
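
Concretely, that is something like this on the machine you submit the job
from (paths below are only the usual defaults; adjust them to wherever your
cluster's configs actually live):

  # option 1: point HADOOP_CONF_DIR at the real Hadoop config directory
  export HADOOP_CONF_DIR=/etc/hadoop/conf

  # option 2: copy the cluster configs next to the Samza deployment
  mkdir -p ~/.samza/conf
  cp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml ~/.samza/conf/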

Cheer,

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108


On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
criccomini@linkedin.com.invalid> wrote:

> Hey Telles,
>
> To get YARN working with the HTTP file system, you need to follow the
> instructions on:
>
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-y
> arn.html
>
>
> In the "Set Up Http Filesystem for YARN" section.
>
> You shouldn't need to compile anything (no Gradle, which is what your
> stack trace is showing). This setup should be done for all of the NMs,
> since they will be the ones downloading your job's package (from
> yarn.package.path).
>
> Cheers,
> Chris
>
> On 8/9/14 9:44 PM, "Telles Nobrega" <te...@gmail.com> wrote:
>
> >Hi again, I tried installing the scala libs but the Http problem still
> >occurs. I realised that I need to compile incubator samza in the machines
> >that I¹m going to run the jobs, but the compilation fails with this huge
> >message:
> >
> >#
> ># There is insufficient memory for the Java Runtime Environment to
> >continue.
> ># Native memory allocation (malloc) failed to allocate 3946053632 bytes
> >for committing reserved memory.
> ># An error report file with more information is saved as:
> ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> >Could not write standard input into: Gradle Worker 13.
> >java.io.IOException: Broken pipe
> >       at java.io.FileOutputStream.writeBytes(Native Method)
> >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> >       at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >       at
> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >       at
> >org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
> >andleRunner.java:53)
> >       at
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> >l$1.run(DefaultExecutorFactory.java:66)
> >       at
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> >1145)
> >       at
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> >:615)
> >       at java.lang.Thread.run(Thread.java:744)
> >Process 'Gradle Worker 13' finished with non-zero exit value 1
> >org.gradle.process.internal.ExecException: Process 'Gradle Worker 13'
> >finished with non-zero exit value 1
> >       at
> >org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
> >xitValue(DefaultExecHandle.java:362)
> >       at
> >org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
> >erProcess.java:89)
> >       at
> >org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
> >rocess.java:33)
> >       at
> >org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
> >ltWorkerProcess.java:55)
> >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >       at
> >sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
> >57)
> >       at
> >sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
> >pl.java:43)
> >       at java.lang.reflect.Method.invoke(Method.java:606)
> >       at
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> >ch.java:35)
> >       at
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> >ch.java:24)
> >       at
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> >       at
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> >       at
> >org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
> >ndler.invoke(ProxyDispatchAdapter.java:93)
> >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> >       at
> >org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
> >andle.java:212)
> >       at
> >org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
> >ava:309)
> >       at
> >org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
> >va:108)
> >       at
> >org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> >       at
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> >l$1.run(DefaultExecutorFactory.java:66)
> >       at
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> >1145)
> >       at
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> >:615)
> >       at java.lang.Thread.run(Thread.java:744)
> >OpenJDK 64-Bit Server VM warning: INFO:
> >os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed;
> >error='Cannot allocate memory' (errno=12)
> >#
> ># There is insufficient memory for the Java Runtime Environment to
> >continue.
> ># Native memory allocation (malloc) failed to allocate 3946053632 bytes
> >for committing reserved memory.
> ># An error report file with more information is saved as:
> ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> >Could not write standard input into: Gradle Worker 14.
> >java.io.IOException: Broken pipe
> >       at java.io.FileOutputStream.writeBytes(Native Method)
> >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> >       at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >       at
> java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> >       at
> >org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
> >andleRunner.java:53)
> >       at
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> >l$1.run(DefaultExecutorFactory.java:66)
> >       at
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> >1145)
> >       at
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> >:615)
> >       at java.lang.Thread.run(Thread.java:744)
> >Process 'Gradle Worker 14' finished with non-zero exit value 1
> >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14'
> >finished with non-zero exit value 1
> >       at
> >org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
> >xitValue(DefaultExecHandle.java:362)
> >       at
> >org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
> >erProcess.java:89)
> >       at
> >org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
> >rocess.java:33)
> >       at
> >org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
> >ltWorkerProcess.java:55)
> >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >       at
> >sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
> >57)
> >       at
> >sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
> >pl.java:43)
> >       at java.lang.reflect.Method.invoke(Method.java:606)
> >       at
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> >ch.java:35)
> >       at
> >org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
> >ch.java:24)
> >       at
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> >       at
> >org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> >       at
> >org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
> >ndler.invoke(ProxyDispatchAdapter.java:93)
> >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> >       at
> >org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
> >andle.java:212)
> >       at
> >org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
> >ava:309)
> >       at
> >org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
> >va:108)
> >       at
> >org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> >       at
> >org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
> >l$1.run(DefaultExecutorFactory.java:66)
> >       at
> >java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
> >1145)
> >       at
> >java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
> >:615)
> >       at java.lang.Thread.r
> >
> >Do I need more memory for my machines? Each already has 4GB. I really
> >need to have this running. I’m not sure which way is best http or hdfs
> >which one you suggest and how can i solve my problem for each case.
> >
> >Thanks in advance and sorry for bothering this much.
> >On 10 Aug 2014, at 00:20, Telles Nobrega <te...@gmail.com> wrote:
> >
> >> Hi Chris, now I have the tar file in my RM machine, and the yarn path
> >>points to it. I changed the core-site.xml to use HttpFileSystem instead
> >>of HDFS now it is failing with
> >>
> >> Application application_1407640485281_0001 failed 2 times due to AM
> >>Container for appattempt_1407640485281_0001_000002 exited with
> >>exitCode:-1000 due to: java.lang.ClassNotFoundException: Class
> >>org.apache.samza.util.hadoop.HttpFileSystem not found
> >>
> >> I think I can solve this just installing scala files from the samza
> >>tutorial, can you confirm that?
> >>
> >> On 09 Aug 2014, at 08:34, Telles Nobrega <te...@gmail.com>
> >>wrote:
> >>
> >>> Hi Chris,
> >>>
> >>> I think the problem is that I forgot to update the yarn.job.package.
> >>> I will try again to see if it works now.
> >>>
> >>> I have one more question, how can I stop (command line) the jobs
> >>>running in my topology, for the experiment that I will run, I need to
> >>>run the same job in 4 minutes intervals. So I need to kill it, clean
> >>>the kafka topics and rerun.
> >>>
> >>> Thanks in advance.
> >>>
> >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> >>><cr...@linkedin.com.INVALID> wrote:
> >>>
> >>>> Hey Telles,
> >>>>
> >>>>>> Do I need to have the job folder on each machine in my cluster?
> >>>>
> >>>> No, you should not need to do this. There are two ways to deploy your
> >>>> tarball to the YARN grid. One is to put it in HDFS, and the other is
> >>>>to
> >>>> put it on an HTTP server. The link to running a Samza job in a
> >>>>multi-node
> >>>> YARN cluster describes how to do both (either HTTP server or HDFS).
> >>>>
> >>>> In both cases, once the tarball is put in on the HTTP/HDFS server(s),
> >>>>you
> >>>> must update yarn.package.path to point to it. From there, the YARN NM
> >>>> should download it for you automatically when you start your job.
> >>>>
> >>>> * Can you send along a paste of your job config?
> >>>>
> >>>> Cheers,
> >>>> Chris
> >>>>
> >>>> On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com> wrote:
> >>>>
> >>>>> Hi Telles, it looks to me that you forgot to update the
> >>>>> "yarn.package.path"
> >>>>> attribute in your config file for the task.
> >>>>>
> >>>>> - Claudio Martins
> >>>>> Head of Engineering
> >>>>> MobileAware USA Inc. / www.mobileaware.com
> >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> >>>>>
> >>>>>
> >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> >>>>><te...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> this is my first time trying to run a job on a multinode
> >>>>>>environment. I
> >>>>>> have the cluster set up, I can see in the GUI that all nodes are
> >>>>>> working.
> >>>>>> Do I need to have the job folder on each machine in my cluster?
> >>>>>> - The first time I tried running with the job on the namenode
> >>>>>>machine
> >>>>>> and
> >>>>>> it failed saying:
> >>>>>>
> >>>>>> Application application_1407509228798_0001 failed 2 times due to AM
> >>>>>> Container for appattempt_1407509228798_0001_000002 exited with
> >>>>>>exitCode:
> >>>>>> -1000 due to: File
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
> >>>>>>age-
> >>>>>> 0.7.0-dist.tar.gz
> >>>>>> does not exist
> >>>>>>
> >>>>>> So I copied the folder to each machine in my cluster and got this
> >>>>>>error:
> >>>>>>
> >>>>>> Application application_1407509228798_0002 failed 2 times due to AM
> >>>>>> Container for appattempt_1407509228798_0002_000002 exited with
> >>>>>>exitCode:
> >>>>>> -1000 due to: Resource
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
> >>>>>>age-
> >>>>>> 0.7.0-dist.tar.gz
> >>>>>> changed on src filesystem (expected 1407509168000, was 1407509434000
> >>>>>>
> >>>>>> What am I missing?
> >>>>>>
> >>>>>> p.s.: I followed this
> >>>>>>
> >>>>>><
> https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
> >>>>>>a>
> >>>>>> tutorial
> >>>>>> and this
> >>>>>> <
> >>>>>>
> >>>>>>
> >>>>>>
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
> >>>>>>node
> >>>>>> -yarn.html
> >>>>>>>
> >>>>>> to
> >>>>>> set up the cluster.
> >>>>>>
> >>>>>> Help is much appreciated.
> >>>>>>
> >>>>>> Thanks in advance.
> >>>>>>
> >>>>>> --
> >>>>>> ------------------------------------------
> >>>>>> Telles Mota Vidal Nobrega
> >>>>>> M.sc. Candidate at UFCG
> >>>>>> B.sc. in Computer Science at UFCG
> >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >>>>>>
> >>>>
> >>>
> >>
> >
>
>

Re: Running Job on Multinode Yarn Cluster

Posted by Chris Riccomini <cr...@linkedin.com.INVALID>.
Hey Telles,

To get YARN working with the HTTP file system, you need to follow the
instructions on:

http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-y
arn.html


In the "Set Up Http Filesystem for YARN" section.

You shouldn't need to compile anything (no Gradle, which is what your
stack trace is showing). This setup should be done for all of the NMs,
since they will be the ones downloading your job's package (from
yarn.package.path).
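
Roughly, that section amounts to two things on every NodeManager: put the
samza-yarn and scala-library jars it lists onto the NM classpath (the exact
jar versions and target directory are in the tutorial, so treat this as a
sketch), and register the http scheme in core-site.xml:

  <!-- core-site.xml on each NodeManager -->
  <property>
    <name>fs.http.impl</name>
    <value>org.apache.samza.util.hadoop.HttpFileSystem</value>
  </property>

That is the same class your ClassNotFoundException complains about, so once
those jars are visible to the NM the -1000 exit should go away.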

Cheers,
Chris

On 8/9/14 9:44 PM, "Telles Nobrega" <te...@gmail.com> wrote:

>Hi again, I tried installing the scala libs but the Http problem still
>occurs. I realised that I need to compile incubator samza in the machines
>that I¹m going to run the jobs, but the compilation fails with this huge
>message: 
>
>#
># There is insufficient memory for the Java Runtime Environment to
>continue.
># Native memory allocation (malloc) failed to allocate 3946053632 bytes
>for committing reserved memory.
># An error report file with more information is saved as:
># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
>Could not write standard input into: Gradle Worker 13.
>java.io.IOException: Broken pipe
>	at java.io.FileOutputStream.writeBytes(Native Method)
>	at java.io.FileOutputStream.write(FileOutputStream.java:345)
>	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>	at 
>org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
>andleRunner.java:53)
>	at 
>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
>l$1.run(DefaultExecutorFactory.java:66)
>	at 
>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
>1145)
>	at 
>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
>:615)
>	at java.lang.Thread.run(Thread.java:744)
>Process 'Gradle Worker 13' finished with non-zero exit value 1
>org.gradle.process.internal.ExecException: Process 'Gradle Worker 13'
>finished with non-zero exit value 1
>	at 
>org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
>xitValue(DefaultExecHandle.java:362)
>	at 
>org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
>erProcess.java:89)
>	at 
>org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
>rocess.java:33)
>	at 
>org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
>ltWorkerProcess.java:55)
>	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>	at 
>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
>57)
>	at 
>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
>pl.java:43)
>	at java.lang.reflect.Method.invoke(Method.java:606)
>	at 
>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
>ch.java:35)
>	at 
>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
>ch.java:24)
>	at 
>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
>	at 
>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
>	at 
>org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
>ndler.invoke(ProxyDispatchAdapter.java:93)
>	at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
>	at 
>org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
>andle.java:212)
>	at 
>org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
>ava:309)
>	at 
>org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
>va:108)
>	at 
>org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
>	at 
>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
>l$1.run(DefaultExecutorFactory.java:66)
>	at 
>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
>1145)
>	at 
>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
>:615)
>	at java.lang.Thread.run(Thread.java:744)
>OpenJDK 64-Bit Server VM warning: INFO:
>os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed;
>error='Cannot allocate memory' (errno=12)
>#
># There is insufficient memory for the Java Runtime Environment to
>continue.
># Native memory allocation (malloc) failed to allocate 3946053632 bytes
>for committing reserved memory.
># An error report file with more information is saved as:
># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
>Could not write standard input into: Gradle Worker 14.
>java.io.IOException: Broken pipe
>	at java.io.FileOutputStream.writeBytes(Native Method)
>	at java.io.FileOutputStream.write(FileOutputStream.java:345)
>	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>	at 
>org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputH
>andleRunner.java:53)
>	at 
>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
>l$1.run(DefaultExecutorFactory.java:66)
>	at 
>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
>1145)
>	at 
>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
>:615)
>	at java.lang.Thread.run(Thread.java:744)
>Process 'Gradle Worker 14' finished with non-zero exit value 1
>org.gradle.process.internal.ExecException: Process 'Gradle Worker 14'
>finished with non-zero exit value 1
>	at 
>org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalE
>xitValue(DefaultExecHandle.java:362)
>	at 
>org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWork
>erProcess.java:89)
>	at 
>org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerP
>rocess.java:33)
>	at 
>org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(Defau
>ltWorkerProcess.java:55)
>	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>	at 
>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
>57)
>	at 
>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
>pl.java:43)
>	at java.lang.reflect.Method.invoke(Method.java:606)
>	at 
>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
>ch.java:35)
>	at 
>org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispat
>ch.java:24)
>	at 
>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
>	at 
>org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
>	at 
>org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHa
>ndler.invoke(ProxyDispatchAdapter.java:93)
>	at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
>	at 
>org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecH
>andle.java:212)
>	at 
>org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.j
>ava:309)
>	at 
>org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.ja
>va:108)
>	at 
>org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
>	at 
>org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImp
>l$1.run(DefaultExecutorFactory.java:66)
>	at 
>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
>1145)
>	at 
>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
>:615)
>	at java.lang.Thread.r
>
>Do I need more memory for my machines? Each already has 4GB. I really
>need to have this running. I’m not sure which way is best http or hdfs
>which one you suggest and how can i solve my problem for each case.
>
>Thanks in advance and sorry for bothering this much.
>On 10 Aug 2014, at 00:20, Telles Nobrega <te...@gmail.com> wrote:
>
>> Hi Chris, now I have the tar file in my RM machine, and the yarn path
>>points to it. I changed the core-site.xml to use HttpFileSystem instead
>>of HDFS now it is failing with
>> 
>> Application application_1407640485281_0001 failed 2 times due to AM
>>Container for appattempt_1407640485281_0001_000002 exited with
>>exitCode:-1000 due to: java.lang.ClassNotFoundException: Class
>>org.apache.samza.util.hadoop.HttpFileSystem not found
>> 
>> I think I can solve this just installing scala files from the samza
>>tutorial, can you confirm that?
>> 
>> On 09 Aug 2014, at 08:34, Telles Nobrega <te...@gmail.com>
>>wrote:
>> 
>>> Hi Chris,
>>> 
>>> I think the problem is that I forgot to update the yarn.job.package.
>>> I will try again to see if it works now.
>>> 
>>> I have one more question, how can I stop (command line) the jobs
>>>running in my topology, for the experiment that I will run, I need to
>>>run the same job in 4 minutes intervals. So I need to kill it, clean
>>>the kafka topics and rerun.
>>> 
>>> Thanks in advance.
>>> 
>>> On 08 Aug 2014, at 12:41, Chris Riccomini
>>><cr...@linkedin.com.INVALID> wrote:
>>> 
>>>> Hey Telles,
>>>> 
>>>>>> Do I need to have the job folder on each machine in my cluster?
>>>> 
>>>> No, you should not need to do this. There are two ways to deploy your
>>>> tarball to the YARN grid. One is to put it in HDFS, and the other is
>>>>to
>>>> put it on an HTTP server. The link to running a Samza job in a
>>>>multi-node
>>>> YARN cluster describes how to do both (either HTTP server or HDFS).
>>>> 
>>>> In both cases, once the tarball is put in on the HTTP/HDFS server(s),
>>>>you
>>>> must update yarn.package.path to point to it. From there, the YARN NM
>>>> should download it for you automatically when you start your job.
>>>> 
>>>> * Can you send along a paste of your job config?
>>>> 
>>>> Cheers,
>>>> Chris
>>>> 
>>>> On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com> wrote:
>>>> 
>>>>> Hi Telles, it looks to me that you forgot to update the
>>>>> "yarn.package.path"
>>>>> attribute in your config file for the task.
>>>>> 
>>>>> - Claudio Martins
>>>>> Head of Engineering
>>>>> MobileAware USA Inc. / www.mobileaware.com
>>>>> office: +1  617 986 5060 / mobile: +1 617 480 5288
>>>>> linkedin: www.linkedin.com/in/martinsclaudio
>>>>> 
>>>>> 
>>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
>>>>><te...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> this is my first time trying to run a job on a multinode
>>>>>>environment. I
>>>>>> have the cluster set up, I can see in the GUI that all nodes are
>>>>>> working.
>>>>>> Do I need to have the job folder on each machine in my cluster?
>>>>>> - The first time I tried running with the job on the namenode
>>>>>>machine
>>>>>> and
>>>>>> it failed saying:
>>>>>> 
>>>>>> Application application_1407509228798_0001 failed 2 times due to AM
>>>>>> Container for appattempt_1407509228798_0001_000002 exited with
>>>>>>exitCode:
>>>>>> -1000 due to: File
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
>>>>>>age-
>>>>>> 0.7.0-dist.tar.gz
>>>>>> does not exist
>>>>>> 
>>>>>> So I copied the folder to each machine in my cluster and got this
>>>>>>error:
>>>>>> 
>>>>>> Application application_1407509228798_0002 failed 2 times due to AM
>>>>>> Container for appattempt_1407509228798_0002_000002 exited with
>>>>>>exitCode:
>>>>>> -1000 due to: Resource
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-pack
>>>>>>age-
>>>>>> 0.7.0-dist.tar.gz
>>>>>> changed on src filesystem (expected 1407509168000, was 1407509434000
>>>>>> 
>>>>>> What am I missing?
>>>>>> 
>>>>>> p.s.: I followed this
>>>>>> 
>>>>>><https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samz
>>>>>>a>
>>>>>> tutorial
>>>>>> and this
>>>>>> <
>>>>>> 
>>>>>> 
>>>>>>http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-
>>>>>>node
>>>>>> -yarn.html
>>>>>>> 
>>>>>> to
>>>>>> set up the cluster.
>>>>>> 
>>>>>> Help is much appreciated.
>>>>>> 
>>>>>> Thanks in advance.
>>>>>> 
>>>>>> --
>>>>>> ------------------------------------------
>>>>>> Telles Mota Vidal Nobrega
>>>>>> M.sc. Candidate at UFCG
>>>>>> B.sc. in Computer Science at UFCG
>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>>>> 
>>>> 
>>> 
>> 
>


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Hi again, I tried installing the scala libs but the HTTP problem still occurs. I realised that I need to compile incubator-samza on the machines where I’m going to run the jobs, but the compilation fails with this huge message:

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
Could not write standard input into: Gradle Worker 13.
java.io.IOException: Broken pipe
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:345)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
	at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
	at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Process 'Gradle Worker 13' finished with non-zero exit value 1
org.gradle.process.internal.ExecException: Process 'Gradle Worker 13' finished with non-zero exit value 1
	at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
	at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
	at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
	at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
	at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
	at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
	at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
	at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
	at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
	at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
	at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
	at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
	at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
	at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
Could not write standard input into: Gradle Worker 14.
java.io.IOException: Broken pipe
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:345)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
	at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
	at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Process 'Gradle Worker 14' finished with non-zero exit value 1
org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
	at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
	at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
	at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
	at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
	at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
	at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
	at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
	at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
	at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
	at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
	at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
	at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
	at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
	at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.r

Do I need more memory for my machines? Each already has 4GB. I really need to have this running. I’m not sure which way is best, HTTP or HDFS; which one do you suggest, and how can I solve my problem for each case?
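
(For reference, the errno=12 / "Cannot allocate memory" lines above mean the
forked Gradle workers could not commit the roughly 3.9 GB of memory they tried
to reserve, so a 4 GB VM will keep failing unless that reservation is made
smaller or the VM gets some swap. A rough sketch of the swap route, with the
size purely an example:

  sudo fallocate -l 4G /swapfile
  sudo chmod 600 /swapfile
  sudo mkswap /swapfile
  sudo swapon /swapfile

That said, this only matters if compiling on these machines turns out to be
required at all.)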

Thanks in advance and sorry for bothering this much.
On 10 Aug 2014, at 00:20, Telles Nobrega <te...@gmail.com> wrote:

> Hi Chris, now I have the tar file in my RM machine, and the yarn path points to it. I changed the core-site.xml to use HttpFileSystem instead of HDFS now it is failing with
> 
> Application application_1407640485281_0001 failed 2 times due to AM Container for appattempt_1407640485281_0001_000002 exited with exitCode:-1000 due to: java.lang.ClassNotFoundException: Class org.apache.samza.util.hadoop.HttpFileSystem not found
> 
> I think I can solve this just installing scala files from the samza tutorial, can you confirm that?
> 
> On 09 Aug 2014, at 08:34, Telles Nobrega <te...@gmail.com> wrote:
> 
>> Hi Chris,
>> 
>> I think the problem is that I forgot to update the yarn.job.package.
>> I will try again to see if it works now.
>> 
>> I have one more question, how can I stop (command line) the jobs running in my topology, for the experiment that I will run, I need to run the same job in 4 minutes intervals. So I need to kill it, clean the kafka topics and rerun.
>> 
>> Thanks in advance.
>> 
>> On 08 Aug 2014, at 12:41, Chris Riccomini <cr...@linkedin.com.INVALID> wrote:
>> 
>>> Hey Telles,
>>> 
>>>>> Do I need to have the job folder on each machine in my cluster?
>>> 
>>> No, you should not need to do this. There are two ways to deploy your
>>> tarball to the YARN grid. One is to put it in HDFS, and the other is to
>>> put it on an HTTP server. The link to running a Samza job in a multi-node
>>> YARN cluster describes how to do both (either HTTP server or HDFS).
>>> 
>>> In both cases, once the tarball is put in on the HTTP/HDFS server(s), you
>>> must update yarn.package.path to point to it. From there, the YARN NM
>>> should download it for you automatically when you start your job.
>>> 
>>> * Can you send along a paste of your job config?
>>> 
>>> Cheers,
>>> Chris
>>> 
>>> On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com> wrote:
>>> 
>>>> Hi Telles, it looks to me that you forgot to update the
>>>> "yarn.package.path"
>>>> attribute in your config file for the task.
>>>> 
>>>> - Claudio Martins
>>>> Head of Engineering
>>>> MobileAware USA Inc. / www.mobileaware.com
>>>> office: +1  617 986 5060 / mobile: +1 617 480 5288
>>>> linkedin: www.linkedin.com/in/martinsclaudio
>>>> 
>>>> 
>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <te...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> this is my first time trying to run a job on a multinode environment. I
>>>>> have the cluster set up, I can see in the GUI that all nodes are
>>>>> working.
>>>>> Do I need to have the job folder on each machine in my cluster?
>>>>> - The first time I tried running with the job on the namenode machine
>>>>> and
>>>>> it failed saying:
>>>>> 
>>>>> Application application_1407509228798_0001 failed 2 times due to AM
>>>>> Container for appattempt_1407509228798_0001_000002 exited with exitCode:
>>>>> -1000 due to: File
>>>>> 
>>>>> 
>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>>>> 0.7.0-dist.tar.gz
>>>>> does not exist
>>>>> 
>>>>> So I copied the folder to each machine in my cluster and got this error:
>>>>> 
>>>>> Application application_1407509228798_0002 failed 2 times due to AM
>>>>> Container for appattempt_1407509228798_0002_000002 exited with exitCode:
>>>>> -1000 due to: Resource
>>>>> 
>>>>> 
>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>>>> 0.7.0-dist.tar.gz
>>>>> changed on src filesystem (expected 1407509168000, was 1407509434000
>>>>> 
>>>>> What am I missing?
>>>>> 
>>>>> p.s.: I followed this
>>>>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
>>>>> tutorial
>>>>> and this
>>>>> <
>>>>> 
>>>>> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
>>>>> -yarn.html
>>>>>> 
>>>>> to
>>>>> set up the cluster.
>>>>> 
>>>>> Help is much appreciated.
>>>>> 
>>>>> Thanks in advance.
>>>>> 
>>>>> --
>>>>> ------------------------------------------
>>>>> Telles Mota Vidal Nobrega
>>>>> M.sc. Candidate at UFCG
>>>>> B.sc. in Computer Science at UFCG
>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>>> 
>>> 
>> 
> 


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
I tried pushing the tar file to HDFS but I got an error from hadoop saying that it couldn’t find the core-site.xml file.

Have you seen this before?

Thanks,

On 10 Aug 2014, at 00:20, Telles Nobrega <te...@gmail.com> wrote:

> Hi Chris, now I have the tar file in my RM machine, and the yarn path points to it. I changed the core-site.xml to use HttpFileSystem instead of HDFS now it is failing with
> 
> Application application_1407640485281_0001 failed 2 times due to AM Container for appattempt_1407640485281_0001_000002 exited with exitCode:-1000 due to: java.lang.ClassNotFoundException: Class org.apache.samza.util.hadoop.HttpFileSystem not found
> 
> I think I can solve this just installing scala files from the samza tutorial, can you confirm that?
> 
> On 09 Aug 2014, at 08:34, Telles Nobrega <te...@gmail.com> wrote:
> 
>> Hi Chris,
>> 
>> I think the problem is that I forgot to update the yarn.job.package.
>> I will try again to see if it works now.
>> 
>> I have one more question, how can I stop (command line) the jobs running in my topology, for the experiment that I will run, I need to run the same job in 4 minutes intervals. So I need to kill it, clean the kafka topics and rerun.
>> 
>> Thanks in advance.
>> 
>> On 08 Aug 2014, at 12:41, Chris Riccomini <cr...@linkedin.com.INVALID> wrote:
>> 
>>> Hey Telles,
>>> 
>>>>> Do I need to have the job folder on each machine in my cluster?
>>> 
>>> No, you should not need to do this. There are two ways to deploy your
>>> tarball to the YARN grid. One is to put it in HDFS, and the other is to
>>> put it on an HTTP server. The link to running a Samza job in a multi-node
>>> YARN cluster describes how to do both (either HTTP server or HDFS).
>>> 
>>> In both cases, once the tarball is put in on the HTTP/HDFS server(s), you
>>> must update yarn.package.path to point to it. From there, the YARN NM
>>> should download it for you automatically when you start your job.
>>> 
>>> * Can you send along a paste of your job config?
>>> 
>>> Cheers,
>>> Chris
>>> 
>>> On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com> wrote:
>>> 
>>>> Hi Telles, it looks to me that you forgot to update the
>>>> "yarn.package.path"
>>>> attribute in your config file for the task.
>>>> 
>>>> - Claudio Martins
>>>> Head of Engineering
>>>> MobileAware USA Inc. / www.mobileaware.com
>>>> office: +1  617 986 5060 / mobile: +1 617 480 5288
>>>> linkedin: www.linkedin.com/in/martinsclaudio
>>>> 
>>>> 
>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <te...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> this is my first time trying to run a job on a multinode environment. I
>>>>> have the cluster set up, I can see in the GUI that all nodes are
>>>>> working.
>>>>> Do I need to have the job folder on each machine in my cluster?
>>>>> - The first time I tried running with the job on the namenode machine
>>>>> and
>>>>> it failed saying:
>>>>> 
>>>>> Application application_1407509228798_0001 failed 2 times due to AM
>>>>> Container for appattempt_1407509228798_0001_000002 exited with exitCode:
>>>>> -1000 due to: File
>>>>> 
>>>>> 
>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>>>> 0.7.0-dist.tar.gz
>>>>> does not exist
>>>>> 
>>>>> So I copied the folder to each machine in my cluster and got this error:
>>>>> 
>>>>> Application application_1407509228798_0002 failed 2 times due to AM
>>>>> Container for appattempt_1407509228798_0002_000002 exited with exitCode:
>>>>> -1000 due to: Resource
>>>>> 
>>>>> 
>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>>>> 0.7.0-dist.tar.gz
>>>>> changed on src filesystem (expected 1407509168000, was 1407509434000
>>>>> 
>>>>> What am I missing?
>>>>> 
>>>>> p.s.: I followed this
>>>>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
>>>>> tutorial
>>>>> and this
>>>>> <
>>>>> 
>>>>> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
>>>>> -yarn.html
>>>>>> 
>>>>> to
>>>>> set up the cluster.
>>>>> 
>>>>> Help is much appreciated.
>>>>> 
>>>>> Thanks in advance.
>>>>> 
>>>>> --
>>>>> ------------------------------------------
>>>>> Telles Mota Vidal Nobrega
>>>>> M.sc. Candidate at UFCG
>>>>> B.sc. in Computer Science at UFCG
>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>>> 
>>> 
>> 
> 


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Hi Chris, now I have the tar file on my RM machine, and yarn.package.path points to it. I changed core-site.xml to use HttpFileSystem instead of HDFS, and now it is failing with:

Application application_1407640485281_0001 failed 2 times due to AM Container for appattempt_1407640485281_0001_000002 exited with exitCode:-1000 due to: java.lang.ClassNotFoundException: Class org.apache.samza.util.hadoop.HttpFileSystem not found

I think I can solve this just by installing the Scala files from the Samza tutorial; can you confirm that?
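
In case it helps, this is roughly the property I added to core-site.xml (the name follows the multi-node YARN tutorial, so please correct me if it is wrong):

  <property>
    <name>fs.http.impl</name>
    <value>org.apache.samza.util.hadoop.HttpFileSystem</value>
  </property>

My understanding is that the samza-yarn and scala-library jars also need to be on the NodeManager classpath for that class to resolve, but I have not confirmed that yet.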

On 09 Aug 2014, at 08:34, Telles Nobrega <te...@gmail.com> wrote:

> Hi Chris,
> 
> I think the problem is that I forgot to update the yarn.job.package.
> I will try again to see if it works now.
> 
> I have one more question, how can I stop (command line) the jobs running in my topology, for the experiment that I will run, I need to run the same job in 4 minutes intervals. So I need to kill it, clean the kafka topics and rerun.
> 
> Thanks in advance.
> 
> On 08 Aug 2014, at 12:41, Chris Riccomini <cr...@linkedin.com.INVALID> wrote:
> 
>> Hey Telles,
>> 
>>>> Do I need to have the job folder on each machine in my cluster?
>> 
>> No, you should not need to do this. There are two ways to deploy your
>> tarball to the YARN grid. One is to put it in HDFS, and the other is to
>> put it on an HTTP server. The link to running a Samza job in a multi-node
>> YARN cluster describes how to do both (either HTTP server or HDFS).
>> 
>> In both cases, once the tarball is put in on the HTTP/HDFS server(s), you
>> must update yarn.package.path to point to it. From there, the YARN NM
>> should download it for you automatically when you start your job.
>> 
>> * Can you send along a paste of your job config?
>> 
>> Cheers,
>> Chris
>> 
>> On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com> wrote:
>> 
>>> Hi Telles, it looks to me that you forgot to update the
>>> "yarn.package.path"
>>> attribute in your config file for the task.
>>> 
>>> - Claudio Martins
>>> Head of Engineering
>>> MobileAware USA Inc. / www.mobileaware.com
>>> office: +1  617 986 5060 / mobile: +1 617 480 5288
>>> linkedin: www.linkedin.com/in/martinsclaudio
>>> 
>>> 
>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <te...@gmail.com>
>>> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> this is my first time trying to run a job on a multinode environment. I
>>>> have the cluster set up, I can see in the GUI that all nodes are
>>>> working.
>>>> Do I need to have the job folder on each machine in my cluster?
>>>> - The first time I tried running with the job on the namenode machine
>>>> and
>>>> it failed saying:
>>>> 
>>>> Application application_1407509228798_0001 failed 2 times due to AM
>>>> Container for appattempt_1407509228798_0001_000002 exited with exitCode:
>>>> -1000 due to: File
>>>> 
>>>> 
>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>>> 0.7.0-dist.tar.gz
>>>> does not exist
>>>> 
>>>> So I copied the folder to each machine in my cluster and got this error:
>>>> 
>>>> Application application_1407509228798_0002 failed 2 times due to AM
>>>> Container for appattempt_1407509228798_0002_000002 exited with exitCode:
>>>> -1000 due to: Resource
>>>> 
>>>> 
>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>>> 0.7.0-dist.tar.gz
>>>> changed on src filesystem (expected 1407509168000, was 1407509434000
>>>> 
>>>> What am I missing?
>>>> 
>>>> p.s.: I followed this
>>>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
>>>> tutorial
>>>> and this
>>>> <
>>>> 
>>>> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
>>>> -yarn.html
>>>>> 
>>>> to
>>>> set up the cluster.
>>>> 
>>>> Help is much appreciated.
>>>> 
>>>> Thanks in advance.
>>>> 
>>>> --
>>>> ------------------------------------------
>>>> Telles Mota Vidal Nobrega
>>>> M.sc. Candidate at UFCG
>>>> B.sc. in Computer Science at UFCG
>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>>> 
>> 
> 


Re: Running Job on Multinode Yarn Cluster

Posted by Telles Nobrega <te...@gmail.com>.
Hi Chris,

I think the problem is that I forgot to update yarn.package.path.
I will try again to see if it works now.

I have one more question: how can I stop (from the command line) the jobs running in my topology? For the experiment that I will run, I need to run the same job at 4-minute intervals, so I need to kill it, clean the Kafka topics, and rerun.
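
(So far the only way I have found is the generic YARN CLI, something along these lines, but I do not know whether Samza ships a nicer wrapper:

  yarn application -list                    # find the application id of the running job
  yarn application -kill <application_id>   # stop that job

)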

Thanks in advance.

On 08 Aug 2014, at 12:41, Chris Riccomini <cr...@linkedin.com.INVALID> wrote:

> Hey Telles,
> 
>>> Do I need to have the job folder on each machine in my cluster?
> 
> No, you should not need to do this. There are two ways to deploy your
> tarball to the YARN grid. One is to put it in HDFS, and the other is to
> put it on an HTTP server. The link to running a Samza job in a multi-node
> YARN cluster describes how to do both (either HTTP server or HDFS).
> 
> In both cases, once the tarball is put in on the HTTP/HDFS server(s), you
> must update yarn.package.path to point to it. From there, the YARN NM
> should download it for you automatically when you start your job.
> 
> * Can you send along a paste of your job config?
> 
> Cheers,
> Chris
> 
> On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com> wrote:
> 
>> Hi Telles, it looks to me that you forgot to update the
>> "yarn.package.path"
>> attribute in your config file for the task.
>> 
>> - Claudio Martins
>> Head of Engineering
>> MobileAware USA Inc. / www.mobileaware.com
>> office: +1  617 986 5060 / mobile: +1 617 480 5288
>> linkedin: www.linkedin.com/in/martinsclaudio
>> 
>> 
>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <te...@gmail.com>
>> wrote:
>> 
>>> Hi,
>>> 
>>> this is my first time trying to run a job on a multinode environment. I
>>> have the cluster set up, I can see in the GUI that all nodes are
>>> working.
>>> Do I need to have the job folder on each machine in my cluster?
>>> - The first time I tried running with the job on the namenode machine
>>> and
>>> it failed saying:
>>> 
>>> Application application_1407509228798_0001 failed 2 times due to AM
>>> Container for appattempt_1407509228798_0001_000002 exited with exitCode:
>>> -1000 due to: File
>>> 
>>> 
>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>> 0.7.0-dist.tar.gz
>>> does not exist
>>> 
>>> So I copied the folder to each machine in my cluster and got this error:
>>> 
>>> Application application_1407509228798_0002 failed 2 times due to AM
>>> Container for appattempt_1407509228798_0002_000002 exited with exitCode:
>>> -1000 due to: Resource
>>> 
>>> 
>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>> 0.7.0-dist.tar.gz
>>> changed on src filesystem (expected 1407509168000, was 1407509434000
>>> 
>>> What am I missing?
>>> 
>>> p.s.: I followed this
>>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
>>> tutorial
>>> and this
>>> <
>>> 
>>> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
>>> -yarn.html
>>>> 
>>> to
>>> set up the cluster.
>>> 
>>> Help is much appreciated.
>>> 
>>> Thanks in advance.
>>> 
>>> --
>>> ------------------------------------------
>>> Telles Mota Vidal Nobrega
>>> M.sc. Candidate at UFCG
>>> B.sc. in Computer Science at UFCG
>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>> 
> 


Re: Running Job on Multinode Yarn Cluster

Posted by Chris Riccomini <cr...@linkedin.com.INVALID>.
Hey Telles,

>> Do I need to have the job folder on each machine in my cluster?

No, you should not need to do this. There are two ways to deploy your
tarball to the YARN grid. One is to put it in HDFS, and the other is to
put it on an HTTP server. The link to running a Samza job in a multi-node
YARN cluster describes how to do both (either HTTP server or HDFS).

In both cases, once the tarball is put on the HTTP/HDFS server(s), you
must update yarn.package.path to point to it. From there, the YARN NM
should download it for you automatically when you start your job.
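
For example (the host name, port, and path below are just placeholders), the property ends up looking something like:

  yarn.package.path=hdfs://<namenode-host>:8020/path/to/samza-job-package-0.7.0-dist.tar.gz

or, if you serve the tarball from an HTTP server:

  yarn.package.path=http://<http-host>:8000/path/to/samza-job-package-0.7.0-dist.tar.gz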

* Can you send along a paste of your job config?

Cheers,
Chris

On 8/8/14 8:04 AM, "Claudio Martins" <cl...@mobileaware.com> wrote:

>Hi Telles, it looks to me that you forgot to update the
>"yarn.package.path"
>attribute in your config file for the task.
>
>- Claudio Martins
>Head of Engineering
>MobileAware USA Inc. / www.mobileaware.com
>office: +1  617 986 5060 / mobile: +1 617 480 5288
>linkedin: www.linkedin.com/in/martinsclaudio
>
>
>On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <te...@gmail.com>
>wrote:
>
>> Hi,
>>
>> this is my first time trying to run a job on a multinode environment. I
>> have the cluster set up, I can see in the GUI that all nodes are
>>working.
>> Do I need to have the job folder on each machine in my cluster?
>>  - The first time I tried running with the job on the namenode machine
>>and
>> it failed saying:
>>
>> Application application_1407509228798_0001 failed 2 times due to AM
>> Container for appattempt_1407509228798_0001_000002 exited with exitCode:
>> -1000 due to: File
>>
>> 
>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>0.7.0-dist.tar.gz
>> does not exist
>>
>> So I copied the folder to each machine in my cluster and got this error:
>>
>> Application application_1407509228798_0002 failed 2 times due to AM
>> Container for appattempt_1407509228798_0002_000002 exited with exitCode:
>> -1000 due to: Resource
>>
>> 
>>file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-
>>0.7.0-dist.tar.gz
>> changed on src filesystem (expected 1407509168000, was 1407509434000
>>
>> What am I missing?
>>
>> p.s.: I followed this
>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
>> tutorial
>> and this
>> <
>> 
>>http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node
>>-yarn.html
>> >
>> to
>> set up the cluster.
>>
>> Help is much appreciated.
>>
>> Thanks in advance.
>>
>> --
>> ------------------------------------------
>> Telles Mota Vidal Nobrega
>> M.sc. Candidate at UFCG
>> B.sc. in Computer Science at UFCG
>> Software Engineer at OpenStack Project - HP/LSD-UFCG
>>


Re: Running Job on Multinode Yarn Cluster

Posted by Claudio Martins <cl...@mobileaware.com>.
Hi Telles, it looks to me that you forgot to update the "yarn.package.path"
attribute in your config file for the task.

- Claudio Martins
Head of Engineering
MobileAware USA Inc. / www.mobileaware.com
office: +1  617 986 5060 / mobile: +1 617 480 5288
linkedin: www.linkedin.com/in/martinsclaudio


On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <te...@gmail.com>
wrote:

> Hi,
>
> this is my first time trying to run a job on a multinode environment. I
> have the cluster set up, I can see in the GUI that all nodes are working.
> Do I need to have the job folder on each machine in my cluster?
>  - The first time I tried running with the job on the namenode machine and
> it failed saying:
>
> Application application_1407509228798_0001 failed 2 times due to AM
> Container for appattempt_1407509228798_0001_000002 exited with exitCode:
> -1000 due to: File
>
> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> does not exist
>
> So I copied the folder to each machine in my cluster and got this error:
>
> Application application_1407509228798_0002 failed 2 times due to AM
> Container for appattempt_1407509228798_0002_000002 exited with exitCode:
> -1000 due to: Resource
>
> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> changed on src filesystem (expected 1407509168000, was 1407509434000
>
> What am I missing?
>
> p.s.: I followed this
> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
> tutorial
> and this
> <
> http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html
> >
> to
> set up the cluster.
>
> Help is much appreciated.
>
> Thanks in advance.
>
> --
> ------------------------------------------
> Telles Mota Vidal Nobrega
> M.sc. Candidate at UFCG
> B.sc. in Computer Science at UFCG
> Software Engineer at OpenStack Project - HP/LSD-UFCG
>