Posted to hdfs-user@hadoop.apache.org by Steve Lewis <lo...@gmail.com> on 2014/04/25 17:46:58 UTC

What configuration parameters cause a Hadoop 2.x job to run on the cluster

Assume I have a machine on the same network as a hadoop 2 cluster but
separate from it.

My understanding is that by setting certain elements of the config file or
local xml files to point to the cluster I can launch a job without having
to log into the cluster, move my jar to hdfs and start the job from the
cluster's hadoop machine.

Does this work?
What parameters need I set?
Where is the jar file?
What issues would I see if the machine is running Windows with Cygwin
installed?

Re: What configuration parameters cause a Hadoop 2.x job to run on the cluster

Posted by unmesha sreeveni <un...@gmail.com>.
   config.set("fs.defaultFS", "hdfs://namenode-host:port/");
   config.set("hadoop.job.ugi", "hdfs");
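[Editor's note: setting fs.defaultFS alone usually isn't enough for a remote Hadoop 2.x client; the job also has to be told to use YARN and where the ResourceManager lives. A minimal sketch follows — all host names, ports, and paths are placeholders, not values from this thread:]

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class RemoteSubmit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholders: point these at your own cluster.
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.address", "rm-host:8032");

        Job job = Job.getInstance(conf, "remote-submit-example");
        // The client ships this local jar into the job's HDFS staging
        // directory; it does not have to be on HDFS beforehand.
        job.setJar("/local/path/to/app.jar");
        // ... set mapper, reducer, input and output paths here ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Equivalently, dropping the cluster's core-site.xml, yarn-site.xml, and mapred-site.xml onto the client classpath supplies the same properties without hard-coding them.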


On Fri, Apr 25, 2014 at 10:46 PM, Oleg Zhurakousky <
oleg.zhurakousky@gmail.com> wrote:

> Yes, it will be copied, since it goes to each job's namespace
>
>
>
> On Fri, Apr 25, 2014 at 1:14 PM, Steve Lewis <lo...@gmail.com>wrote:
>
>> I am using MR and know the job.setJar command - I can add all
>> dependencies to the jar in the lib directory, but I was wondering whether
>> Hadoop would copy a jar from my local machine to the cluster - also, if I
>> ran multiple jobs with the same jar, whether the jar would be copied N
>> times (I typically chain 5 map-reduce jobs).
>>
>>
>> On Fri, Apr 25, 2014 at 10:08 AM, Oleg Zhurakousky <
>> oleg.zhurakousky@gmail.com> wrote:
>>
>>> Are you talking about MR or plain YARN application?
>>> In MR you typically use one of the job.setJar* methods. That aside you
>>> may have more than your app JAR (dependencies). So you can copy the
>>> dependencies to all hadoop nodes classpath (e.g., shared dir)
>>>
>>> Oleg
>>>
>>>
>>> On Fri, Apr 25, 2014 at 1:02 PM, Steve Lewis <lo...@gmail.com>wrote:
>>>
>>>> so if I create a Hadoop jar file with referenced libraries in the lib
>>>> directory do I need to move it to hdfs or can it sit on my local machine?
>>>> if I move it to hdfs where does it live - which is to say how do I specify
>>>> the path?
>>>>
>>>>
>>>> On Fri, Apr 25, 2014 at 9:52 AM, Oleg Zhurakousky <
>>>> oleg.zhurakousky@gmail.com> wrote:
>>>>
>>>>> Yes, if you are running MR
>>>>>
>>>>>
>>>>> On Fri, Apr 25, 2014 at 12:48 PM, Steve Lewis <lo...@gmail.com>wrote:
>>>>>
>>>>>> Thank you for your answer
>>>>>>
>>>>>> 1) I am using YARN
>>>>>> 2) So presumably dropping core-site.xml and yarn-site.xml into user.dir
>>>>>> works; do I need mapred-site.xml as well?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 25, 2014 at 9:00 AM, Oleg Zhurakousky <
>>>>>> oleg.zhurakousky@gmail.com> wrote:
>>>>>>
>>>>>>> What version of Hadoop are you using? (YARN or no YARN)
>>>>>>> To answer your question: yes, it's possible and simple. All you need
>>>>>>> to do is have the Hadoop JARs on the classpath, with the relevant
>>>>>>> configuration files on the same classpath pointing to the Hadoop
>>>>>>> cluster. Most often people simply copy core-site.xml, yarn-site.xml,
>>>>>>> etc. from the actual cluster to the application classpath, and then
>>>>>>> you can run it straight from the IDE.
>>>>>>>
>>>>>>> Not a Windows user, so I'm not sure about the second part of the
>>>>>>> question.
>>>>>>>
>>>>>>> Cheers
>>>>>>> Oleg
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <lordjoe2000@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Assume I have a machine on the same network as a hadoop 2 cluster
>>>>>>>> but separate from it.
>>>>>>>>
>>>>>>>> My understanding is that by setting certain elements of the config
>>>>>>>> file or local xml files to point to the cluster I can launch a job without
>>>>>>>> having to log into the cluster, move my jar to hdfs and start the job from
>>>>>>>> the cluster's hadoop machine.
>>>>>>>>
>>>>>>>> Does this work?
>>>>>>>> What parameters need I set?
>>>>>>>> Where is the jar file?
>>>>>>>> What issues would I see if the machine is running Windows with
>>>>>>>> Cygwin installed?
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Steven M. Lewis PhD
>>>>>> 4221 105th Ave NE
>>>>>> Kirkland, WA 98033
>>>>>> 206-384-1340 (cell)
>>>>>> Skype lordjoe_com
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Steven M. Lewis PhD
>>>> 4221 105th Ave NE
>>>> Kirkland, WA 98033
>>>> 206-384-1340 (cell)
>>>> Skype lordjoe_com
>>>>
>>>>
>>>
>>
>>
>> --
>> Steven M. Lewis PhD
>> 4221 105th Ave NE
>> Kirkland, WA 98033
>> 206-384-1340 (cell)
>> Skype lordjoe_com
>>
>>
>


-- 
*Thanks & Regards *


*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/
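[Editor's note: on the question raised above of whether mapred-site.xml is needed in addition to core-site.xml and yarn-site.xml: in practice, yes, because mapred-site.xml is where a Hadoop 2.x client learns to submit to YARN instead of running the job locally. An illustrative minimal client-side configuration — host names are placeholders:]

```xml
<!-- mapred-site.xml: tell the client to submit via YARN -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

<!-- yarn-site.xml: tell the client where the ResourceManager is -->
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm-host</value> <!-- placeholder -->
  </property>
</configuration>
```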

Re: What configuration parameters cause a Hadoop 2.x job to run on the cluster

Posted by Oleg Zhurakousky <ol...@gmail.com>.
Yes, it will be copied, since it goes to each job's namespace



On Fri, Apr 25, 2014 at 1:14 PM, Steve Lewis <lo...@gmail.com> wrote:

> I am using MR and know the job.setJar command - I can add all dependencies
> to the jar in the lib directory, but I was wondering whether Hadoop would
> copy a jar from my local machine to the cluster - also, if I ran multiple
> jobs with the same jar, whether the jar would be copied N times (I
> typically chain 5 map-reduce jobs).
>
>
> On Fri, Apr 25, 2014 at 10:08 AM, Oleg Zhurakousky <
> oleg.zhurakousky@gmail.com> wrote:
>
>> Are you talking about MR or plain YARN application?
>> In MR you typically use one of the job.setJar* methods. That aside you
>> may have more than your app JAR (dependencies). So you can copy the
>> dependencies to all hadoop nodes classpath (e.g., shared dir)
>>
>> Oleg
>>
>>
>> On Fri, Apr 25, 2014 at 1:02 PM, Steve Lewis <lo...@gmail.com>wrote:
>>
>>> so if I create a Hadoop jar file with referenced libraries in the lib
>>> directory do I need to move it to hdfs or can it sit on my local machine?
>>> if I move it to hdfs where does it live - which is to say how do I specify
>>> the path?
>>>
>>>
>>> On Fri, Apr 25, 2014 at 9:52 AM, Oleg Zhurakousky <
>>> oleg.zhurakousky@gmail.com> wrote:
>>>
>>>> Yes, if you are running MR
>>>>
>>>>
>>>> On Fri, Apr 25, 2014 at 12:48 PM, Steve Lewis <lo...@gmail.com>wrote:
>>>>
>>>>> Thank you for your answer
>>>>>
>>>>> 1) I am using YARN
>>>>> 2) So presumably dropping core-site.xml and yarn-site.xml into user.dir
>>>>> works; do I need mapred-site.xml as well?
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Apr 25, 2014 at 9:00 AM, Oleg Zhurakousky <
>>>>> oleg.zhurakousky@gmail.com> wrote:
>>>>>
>>>>>> What version of Hadoop are you using? (YARN or no YARN)
>>>>>> To answer your question: yes, it's possible and simple. All you need
>>>>>> to do is have the Hadoop JARs on the classpath, with the relevant
>>>>>> configuration files on the same classpath pointing to the Hadoop
>>>>>> cluster. Most often people simply copy core-site.xml, yarn-site.xml,
>>>>>> etc. from the actual cluster to the application classpath, and then
>>>>>> you can run it straight from the IDE.
>>>>>>
>>>>>> Not a Windows user, so I'm not sure about the second part of the question.
>>>>>>
>>>>>> Cheers
>>>>>> Oleg
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <lo...@gmail.com>wrote:
>>>>>>
>>>>>>> Assume I have a machine on the same network as a hadoop 2 cluster
>>>>>>> but separate from it.
>>>>>>>
>>>>>>> My understanding is that by setting certain elements of the config
>>>>>>> file or local xml files to point to the cluster I can launch a job without
>>>>>>> having to log into the cluster, move my jar to hdfs and start the job from
>>>>>>> the cluster's hadoop machine.
>>>>>>>
>>>>>>> Does this work?
>>>>>>> What parameters need I set?
>>>>>>> Where is the jar file?
>>>>>>> What issues would I see if the machine is running Windows with
>>>>>>> Cygwin installed?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Steven M. Lewis PhD
>>>>> 4221 105th Ave NE
>>>>> Kirkland, WA 98033
>>>>> 206-384-1340 (cell)
>>>>> Skype lordjoe_com
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Steven M. Lewis PhD
>>> 4221 105th Ave NE
>>> Kirkland, WA 98033
>>> 206-384-1340 (cell)
>>> Skype lordjoe_com
>>>
>>>
>>
>
>
> --
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com
>
>
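[Editor's note: the copy-per-job behavior described above can be sketched as a chained submission. With N chained jobs, the same local jar is staged N times, once into each job's own staging directory. The jar path and job names below are hypothetical:]

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ChainedJobs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String localJar = "/local/path/to/app.jar"; // hypothetical path
        for (int stage = 0; stage < 5; stage++) {
            Job job = Job.getInstance(conf, "chain-stage-" + stage);
            // Each submission uploads the jar into this job's own HDFS
            // staging area, so five stages mean five uploads.
            job.setJar(localJar);
            // ... wire this stage's input/output paths here ...
            if (!job.waitForCompletion(true)) {
                System.exit(1); // abort the chain on failure
            }
        }
    }
}
```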

Re: What configuration parameters cause a Hadoop 2.x job to run on the cluster

Posted by Steve Lewis <lo...@gmail.com>.
I am using MR and know the job.setJar command - I can add all dependencies
to the jar in the lib directory, but I was wondering whether Hadoop would
copy a jar from my local machine to the cluster - also, if I ran multiple
jobs with the same jar, whether the jar would be copied N times (I typically
chain 5 map-reduce jobs).


On Fri, Apr 25, 2014 at 10:08 AM, Oleg Zhurakousky <
oleg.zhurakousky@gmail.com> wrote:

> Are you talking about MR or plain YARN application?
> In MR you typically use one of the job.setJar* methods. That aside you may
> have more than your app JAR (dependencies). So you can copy the
> dependencies to all hadoop nodes classpath (e.g., shared dir)
>
> Oleg
>
>
> On Fri, Apr 25, 2014 at 1:02 PM, Steve Lewis <lo...@gmail.com>wrote:
>
>> so if I create a Hadoop jar file with referenced libraries in the lib
>> directory do I need to move it to hdfs or can it sit on my local machine?
>> if I move it to hdfs where does it live - which is to say how do I specify
>> the path?
>>
>>
>> On Fri, Apr 25, 2014 at 9:52 AM, Oleg Zhurakousky <
>> oleg.zhurakousky@gmail.com> wrote:
>>
>>> Yes, if you are running MR
>>>
>>>
>>> On Fri, Apr 25, 2014 at 12:48 PM, Steve Lewis <lo...@gmail.com>wrote:
>>>
>>>> Thank you for your answer
>>>>
>>>> 1) I am using YARN
>>>> 2) So presumably dropping core-site.xml and yarn-site.xml into user.dir
>>>> works; do I need mapred-site.xml as well?
>>>>
>>>>
>>>>
>>>> On Fri, Apr 25, 2014 at 9:00 AM, Oleg Zhurakousky <
>>>> oleg.zhurakousky@gmail.com> wrote:
>>>>
>>>>> What version of Hadoop are you using? (YARN or no YARN)
>>>>> To answer your question: yes, it's possible and simple. All you need to
>>>>> do is have the Hadoop JARs on the classpath, with the relevant
>>>>> configuration files on the same classpath pointing to the Hadoop cluster.
>>>>> Most often people simply copy core-site.xml, yarn-site.xml, etc. from the
>>>>> actual cluster to the application classpath, and then you can run it straight from the IDE.
>>>>>
>>>>> Not a windows user so not sure about that second part of the question.
>>>>>
>>>>> Cheers
>>>>> Oleg
>>>>>
>>>>>
>>>>> On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <lo...@gmail.com>wrote:
>>>>>
>>>>>> Assume I have a machine on the same network as a hadoop 2 cluster but
>>>>>> separate from it.
>>>>>>
>>>>>> My understanding is that by setting certain elements of the config
>>>>>> file or local xml files to point to the cluster I can launch a job without
>>>>>> having to log into the cluster, move my jar to hdfs and start the job from
>>>>>> the cluster's hadoop machine.
>>>>>>
>>>>>> Does this work?
>>>>>> What parameters need I set?
>>>>>> Where is the jar file?
>>>>>> What issues would I see if the machine is running Windows with cygwin
>>>>>> installed?
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Steven M. Lewis PhD
>>>> 4221 105th Ave NE
>>>> Kirkland, WA 98033
>>>> 206-384-1340 (cell)
>>>> Skype lordjoe_com
>>>>
>>>>
>>>
>>
>>
>> --
>> Steven M. Lewis PhD
>> 4221 105th Ave NE
>> Kirkland, WA 98033
>> 206-384-1340 (cell)
>> Skype lordjoe_com
>>
>>
>


-- 
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
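
For concreteness, here is a hedged sketch of the remote-submission path being
discussed. The class name, jar path, and input/output paths below are
hypothetical; it assumes the Hadoop 2.x client jars plus the cluster's
*-site.xml files are already on the classpath, and it omits the mapper/reducer
setup a real job needs:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RemoteSubmit {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / yarn-site.xml / mapred-site.xml from the
        // classpath, so the client knows where the NameNode and RM live.
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "remote-job");
        // The jar stays on local disk; on submit the client uploads it to the
        // job's HDFS staging directory, so each submitted job gets its own copy.
        job.setJar("/local/path/myapp-with-lib.jar");  // hypothetical path
        // mapper/reducer/key/value classes would be set here in a real job
        FileInputFormat.addInputPath(job, new Path("/user/steve/input"));    // hypothetical
        FileOutputFormat.setOutputPath(job, new Path("/user/steve/output")); // hypothetical
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Chaining five jobs this way means five submissions, so the jar would be staged
five times unless it is pre-installed on the cluster's classpath.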

Re: What configuration parameters cause a Hadoop 2.x job to run on the cluster

Posted by Oleg Zhurakousky <ol...@gmail.com>.
Are you talking about MR or a plain YARN application?
In MR you typically use one of the job.setJar* methods. That aside, you may
have more than your app JAR (dependencies). So you can copy the
dependencies to the classpath of all Hadoop nodes (e.g., a shared dir)

Oleg


On Fri, Apr 25, 2014 at 1:02 PM, Steve Lewis <lo...@gmail.com> wrote:

> so if I create a Hadoop jar file with referenced libraries in the lib
> directory do I need to move it to hdfs or can it sit on my local machine?
> if I move it to hdfs where does it live - which is to say how do I specify
> the path?
>
>
> On Fri, Apr 25, 2014 at 9:52 AM, Oleg Zhurakousky <
> oleg.zhurakousky@gmail.com> wrote:
>
>> Yes, if you are running MR
>>
>>
>> On Fri, Apr 25, 2014 at 12:48 PM, Steve Lewis <lo...@gmail.com>wrote:
>>
>>> Thank you for your answer
>>>
>>> 1) I am using YARN
>>> 2) So presumably dropping core-site.xml and yarn-site.xml into user.dir
>>> works; do I need mapred-site.xml as well?
>>>
>>>
>>>
>>> On Fri, Apr 25, 2014 at 9:00 AM, Oleg Zhurakousky <
>>> oleg.zhurakousky@gmail.com> wrote:
>>>
>>>> What version of Hadoop are you using? (YARN or no YARN)
>>>> To answer your question: yes, it's possible and simple. All you need to do
>>>> is have the Hadoop JARs on the classpath, with the relevant configuration
>>>> files on the same classpath pointing to the Hadoop cluster. Most often
>>>> people simply copy core-site.xml, yarn-site.xml, etc. from the actual cluster
>>>> to the application classpath, and then you can run it straight from the IDE.
>>>>
>>>> Not a windows user so not sure about that second part of the question.
>>>>
>>>> Cheers
>>>> Oleg
>>>>
>>>>
>>>> On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <lo...@gmail.com>wrote:
>>>>
>>>>> Assume I have a machine on the same network as a hadoop 2 cluster but
>>>>> separate from it.
>>>>>
>>>>> My understanding is that by setting certain elements of the config
>>>>> file or local xml files to point to the cluster I can launch a job without
>>>>> having to log into the cluster, move my jar to hdfs and start the job from
>>>>> the cluster's hadoop machine.
>>>>>
>>>>> Does this work?
>>>>> What parameters need I set?
>>>>> Where is the jar file?
>>>>> What issues would I see if the machine is running Windows with cygwin
>>>>> installed?
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Steven M. Lewis PhD
>>> 4221 105th Ave NE
>>> Kirkland, WA 98033
>>> 206-384-1340 (cell)
>>> Skype lordjoe_com
>>>
>>>
>>
>
>
> --
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com
>
>
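
To illustrate the "dependencies inside the jar" option mentioned above (file
names here are hypothetical; this is a sketch of one common layout, not a
prescription): for MR jobs, jars placed under a lib/ directory inside the job
jar are added to the task classpath at runtime, which avoids copying
dependencies to every node by hand.

```
myapp-with-lib.jar
 |- META-INF/MANIFEST.MF
 |- com/example/MyMapper.class   <- job classes (names hypothetical)
 |- lib/guava-14.0.jar           <- bundled dependencies; jars under lib/
 |- lib/commons-lang-2.6.jar        are put on the task classpath
```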


Re: What configuration parameters cause a Hadoop 2.x job to run on the cluster

Posted by Steve Lewis <lo...@gmail.com>.
So if I create a Hadoop jar file with referenced libraries in the lib
directory, do I need to move it to HDFS, or can it sit on my local machine?
If I move it to HDFS, where does it live - which is to say, how do I specify
the path?


On Fri, Apr 25, 2014 at 9:52 AM, Oleg Zhurakousky <
oleg.zhurakousky@gmail.com> wrote:

> Yes, if you are running MR
>
>
> On Fri, Apr 25, 2014 at 12:48 PM, Steve Lewis <lo...@gmail.com>wrote:
>
>> Thank you for your answer
>>
>> 1) I am using YARN
>> 2) So presumably dropping core-site.xml and yarn-site.xml into user.dir
>> works; do I need mapred-site.xml as well?
>>
>>
>>
>> On Fri, Apr 25, 2014 at 9:00 AM, Oleg Zhurakousky <
>> oleg.zhurakousky@gmail.com> wrote:
>>
>>> What version of Hadoop are you using? (YARN or no YARN)
>>> To answer your question: yes, it's possible and simple. All you need to do
>>> is have the Hadoop JARs on the classpath, with the relevant configuration
>>> files on the same classpath pointing to the Hadoop cluster. Most often people
>>> simply copy core-site.xml, yarn-site.xml, etc. from the actual cluster to the
>>> application classpath, and then you can run it straight from the IDE.
>>>
>>> Not a windows user so not sure about that second part of the question.
>>>
>>> Cheers
>>> Oleg
>>>
>>>
>>> On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <lo...@gmail.com>wrote:
>>>
>>>> Assume I have a machine on the same network as a hadoop 2 cluster but
>>>> separate from it.
>>>>
>>>> My understanding is that by setting certain elements of the config file
>>>> or local xml files to point to the cluster I can launch a job without
>>>> having to log into the cluster, move my jar to hdfs and start the job from
>>>> the cluster's hadoop machine.
>>>>
>>>> Does this work?
>>>> What parameters need I set?
>>>> Where is the jar file?
>>>> What issues would I see if the machine is running Windows with cygwin
>>>> installed?
>>>>
>>>>
>>>
>>
>>
>> --
>> Steven M. Lewis PhD
>> 4221 105th Ave NE
>> Kirkland, WA 98033
>> 206-384-1340 (cell)
>> Skype lordjoe_com
>>
>>
>


-- 
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com


Re: What configuration parameters cause a Hadoop 2.x job to run on the cluster

Posted by Oleg Zhurakousky <ol...@gmail.com>.
Yes, if you are running MR


On Fri, Apr 25, 2014 at 12:48 PM, Steve Lewis <lo...@gmail.com> wrote:

> Thank you for your answer
>
> 1) I am using YARN
> 2) So presumably dropping core-site.xml and yarn-site.xml into user.dir
> works; do I need mapred-site.xml as well?
>
>
>
> On Fri, Apr 25, 2014 at 9:00 AM, Oleg Zhurakousky <
> oleg.zhurakousky@gmail.com> wrote:
>
>> What version of Hadoop are you using? (YARN or no YARN?)
>> To answer your question: yes, it's possible and simple. All you need to do
>> is have the Hadoop JARs on the classpath, with the relevant configuration
>> files on the same classpath pointing to the Hadoop cluster. Most often
>> people simply copy core-site.xml, yarn-site.xml, etc. from the actual
>> cluster to the application classpath, and then you can run the job straight
>> from the IDE.
>>
>> Not a Windows user, so I'm not sure about the second part of the question.
>>
>> Cheers
>> Oleg
>>
>>
>> On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <lo...@gmail.com>wrote:
>>
>>> Assume I have a machine on the same network as a Hadoop 2 cluster but
>>> separate from it.
>>>
>>> My understanding is that by setting certain elements of the config file
>>> or local XML files to point to the cluster, I can launch a job without
>>> having to log into the cluster, move my jar to HDFS, and start the job
>>> from the cluster's Hadoop machine.
>>>
>>> Does this work?
>>> What parameters need I set?
>>> Where is the jar file?
>>> What issues would I see if the machine is running Windows with Cygwin
>>> installed?
>>>
>>>
>>
>
>
> --
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com
>
>

Re: What configuration parameters cause a Hadoop 2.x job to run on the cluster

Posted by Steve Lewis <lo...@gmail.com>.
Thank you for your answer

1) I am using YARN
2) So presumably dropping core-site.xml and yarn-site.xml into user.dir works;
do I need mapred-site.xml as well?



On Fri, Apr 25, 2014 at 9:00 AM, Oleg Zhurakousky <
oleg.zhurakousky@gmail.com> wrote:

> What version of Hadoop are you using? (YARN or no YARN?)
> To answer your question: yes, it's possible and simple. All you need to do
> is have the Hadoop JARs on the classpath, with the relevant configuration
> files on the same classpath pointing to the Hadoop cluster. Most often
> people simply copy core-site.xml, yarn-site.xml, etc. from the actual
> cluster to the application classpath, and then you can run the job straight
> from the IDE.
>
> Not a Windows user, so I'm not sure about the second part of the question.
>
> Cheers
> Oleg
>
>
> On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <lo...@gmail.com>wrote:
>
>> Assume I have a machine on the same network as a Hadoop 2 cluster but
>> separate from it.
>>
>> My understanding is that by setting certain elements of the config file
>> or local XML files to point to the cluster, I can launch a job without
>> having to log into the cluster, move my jar to HDFS, and start the job
>> from the cluster's Hadoop machine.
>>
>> Does this work?
>> What parameters need I set?
>> Where is the jar file?
>> What issues would I see if the machine is running Windows with Cygwin
>> installed?
>>
>>
>


-- 
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com

Re: What configuration parameters cause a Hadoop 2.x job to run on the cluster

Posted by Oleg Zhurakousky <ol...@gmail.com>.
What version of Hadoop are you using? (YARN or no YARN?)
To answer your question: yes, it's possible and simple. All you need to do is
have the Hadoop JARs on the classpath, with the relevant configuration files
on the same classpath pointing to the Hadoop cluster. Most often people
simply copy core-site.xml, yarn-site.xml, etc. from the actual cluster to the
application classpath, and then you can run the job straight from the IDE.
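
For example, the two properties that matter most look roughly like this (the hostnames and ports below are placeholders; use the values from your cluster's own config files):

```xml
<!-- core-site.xml: where HDFS lives -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>

<!-- yarn-site.xml: where the ResourceManager accepts application submissions -->
<property>
  <name>yarn.resourcemanager.address</name>
  <value>resourcemanager.example.com:8032</value>
</property>
```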

Not a Windows user, so I'm not sure about the second part of the question.

Cheers
Oleg


On Fri, Apr 25, 2014 at 11:46 AM, Steve Lewis <lo...@gmail.com> wrote:

> Assume I have a machine on the same network as a Hadoop 2 cluster but
> separate from it.
>
> My understanding is that by setting certain elements of the config file or
> local XML files to point to the cluster, I can launch a job without having
> to log into the cluster, move my jar to HDFS, and start the job from the
> cluster's Hadoop machine.
>
> Does this work?
> What parameters need I set?
> Where is the jar file?
> What issues would I see if the machine is running Windows with Cygwin
> installed?
>
>
