Posted to hdfs-user@hadoop.apache.org by Pavan Sudheendra <pa...@gmail.com> on 2013/08/13 13:19:59 UTC

Maven Cloudera Configuration problem

Hi,
I'm currently using maven to build the jars necessary for my
map-reduce program to run and it works for a single node cluster..

For a multi node cluster, how do i specify my map-reduce program to
ingest the cluster settings instead of localhost settings?
I don't know how to specify this using maven to build my jar.

I'm using the cdh distribution by the way..
-- 
Regards-
Pavan

Re: Maven Cloudera Configuration problem

Posted by Shahab Yunus <sh...@gmail.com>.
You need to configure your namenode and jobtracker information in the
configuration files within you application. Only set the relevant
properties in the copy of the files that you are bundling in your job. For
rest the default values would be used from the default configuration files
(core-default.xml, mapred-default.xml) already bundled in the lib/jar
provided by cloudera/hadoop. The assumption is that this is for MRv1.

Anyway, you should go through this for details
http://hadoop.apache.org/docs/stable/cluster_setup.html

*core-site.xml* (the security ones are optional; if you are not using
anything special you can remove them and rely on the defaults, which is also
'simple').

<configuration>

  <property>

    <name>fs.defaultFS</name>

    <value>hdfs://server:8020</value>

  </property>

  <property>

    <name>hadoop.security.authentication</name>

    <value>simple</value>

  </property>

  <property>

    <name>hadoop.security.auth_to_local</name>

    <value>DEFAULT</value>

  </property>

</configuration>


*mapred-site.xml*

<configuration>

  <property>

    <name>mapred.job.tracker</name>

    <value>server:</value>

  </property>

</configuration>
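Hadoop's Configuration class reads files like the two above by collecting `<property>` name/value pairs, with later resources overriding earlier ones. As a rough, JDK-only illustration of that lookup (this is a simplified sketch, not Hadoop's actual implementation; the class name is made up):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Simplified sketch of how a Hadoop-style *-site.xml becomes
// name -> value pairs. Hadoop's real Configuration class additionally
// layers defaults, overrides, and variable expansion on top of this.
public class ConfigSketch {

    static Map<String, String> parse(String xml) throws Exception {
        NodeList props = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getElementsByTagName("property");
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            out.put(p.getElementsByTagName("name").item(0).getTextContent().trim(),
                    p.getElementsByTagName("value").item(0).getTextContent().trim());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String coreSite = "<configuration><property>"
                + "<name>fs.defaultFS</name><value>hdfs://server:8020</value>"
                + "</property></configuration>";
        System.out.println(parse(coreSite).get("fs.defaultFS"));
    }
}
```

The point of bundling the files in your job jar is that whichever copy of these files your classpath provides is what a plain `new Configuration()` ends up resolving against.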


Regards,
Shahab



On Tue, Aug 13, 2013 at 7:19 AM, Pavan Sudheendra <pa...@gmail.com>wrote:

> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>

Re: Maven Cloudera Configuration problem

Posted by Brad Cox <br...@gmail.com>.
I've been stuck on the same question lately so don't take this as definitive, just my best guess at what's required.

Using maven as your hadoop source is going to give you a "vanilla" hadoop; one that runs on localhost. You need one that you've customized to point to your remote cluster and you can't get that via maven. 

So my *GUESS* is you need to do a plain local install of hadoop and point HADOOP_HOME at that. Customize as required, then convince eclipse to use that instead of going thru maven (i.e. remove hadoop from the dependency list).

Everyone; is this on the right path? Anyone know of exact instructions?

On Aug 13, 2013, at 12:07 PM, Pavan Sudheendra <pa...@gmail.com> wrote:

> When i actually run the job on the multi node cluster, logs shows it
> uses localhost configurations which i don't want..
> 
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
> 
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
> 
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com> wrote:
>> Hi Shabab and Sandy,
>> The thing is we have a 6 node cloudera cluster running.. For
>> development purposes, i was building a map-reduce application on a
>> single node apache distribution hadoop with maven..
>> 
>> To be frank, i don't know how to deploy this application on a multi
>> node cloudera cluster. I am fairly well versed with Multi Node Apache
>> Hadoop Distribution.. So, how can i go forward?
>> 
>> Thanks for all the help :)
>> 
>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>>> Hi Pavan,
>>> 
>>> Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from.
>>> 
>>> Is there an issue you're coming up against when trying to run your job on a cluster?
>>> 
>>> -Sandy
>>> 
>>> (iphnoe tpying)
>>> 
>>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> I'm currently using maven to build the jars necessary for my
>>>> map-reduce program to run and it works for a single node cluster..
>>>> 
>>>> For a multi node cluster, how do i specify my map-reduce program to
>>>> ingest the cluster settings instead of localhost settings?
>>>> I don't know how to specify this using maven to build my jar.
>>>> 
>>>> I'm using the cdh distribution by the way..
>>>> --
>>>> Regards-
>>>> Pavan
>> 
>> 
>> 
>> --
>> Regards-
>> Pavan
> 
> 
> 
> -- 
> Regards-
> Pavan

Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com http://virtualschool.edu





Re: Maven Cloudera Configuration problem

Posted by Shahab Yunus <sh...@gmail.com>.
In our Cloudera 4.2.0 cluster, I log in as the *admin* user (do you have
appropriate permissions, by the way?). Then I click on any one of the 3
services (hbase, mapred, hdfs; excluding zookeeper) from the top-leftish
menu. For each of these I can click the *Configuration* tab, which is
in the top-middlish section of the page. Once the configuration page opens,
I click the Action menu on the top-right. One of its sub-menus is
*Download Client Configuration*, which, as the name says, downloads the
config files (a zip file, to be exact) to be used on client machines.
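Once unzipped, those client configs are ordinary Hadoop XML files that you point your client's configuration directory at. A downloaded mapred-site.xml would look roughly like the following (the hostname and port below are placeholders for illustration, not values from this thread):

```
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <!-- placeholder: your cluster's JobTracker host:port -->
    <value>jobtracker.example.com:8021</value>
  </property>
</configuration>
```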

Regards,
Shahab


On Tue, Aug 13, 2013 at 6:07 PM, Brad Cox <br...@gmail.com> wrote:

> That link got my hopes up. But Cloudera Manager  (what I'm running; on
> CDH4) does not offer an "Export Client Config" option. What am I missing?
>
> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>
> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
> property does not point to 'local' but instead to your job-tracker host and
> port.
>
> *But before that*, as Sandy said, your client machine (from where you will
> be kicking off your jobs and apps) should be using config files which will
> have your cluster's configuration. This is the alternative that you should
> follow if you don't want to bundle the configs for your cluster in the
> application itself (either in java code or in separate copies of the config
> files with the relevant properties set). This was something which I was
> suggesting early on just to get you started using your cluster instead of
> local mode.
>
> By the way, have you seen the following link? It gives you step-by-step
> information about how to generate config files specific to your cluster and
> then how to place and use them from any machine you want to designate as
> your client. Running your jobs from one of the datanodes without proper
> config would not work.
>
> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>
> Regards,
> Shahab
>
>
> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra <pavan0591@gmail.com
> >wrote:
>
> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode..
>
> What changes should i make so that my application would take advantage
> of the cluster as a whole?
>
> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>
> Nothing in your pom.xml should affect the configurations your job runs
>
> with.
>
>
> Are you running your job from a node on the cluster? When you say
>
> localhost configurations, do you mean it's using the LocalJobRunner?
>
>
> -sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> When i actually run the job on the multi node cluster, logs shows it
> uses localhost configurations which i don't want..
>
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
>
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
>
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
> Hi Shabab and Sandy,
> The thing is we have a 6 node cloudera cluster running.. For
> development purposes, i was building a map-reduce application on a
> single node apache distribution hadoop with maven..
>
> To be frank, i don't know how to deploy this application on a multi
> node cloudera cluster. I am fairly well versed with Multi Node Apache
> Hadoop Distribution.. So, how can i go forward?
>
> Thanks for all the help :)
>
> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>
> Hi Pavan,
>
> Configuration properties generally aren't included in the jar itself
>
> unless you explicitly set them in your java code. Rather they're picked up
> from the mapred-site.xml file located in the Hadoop configuration directory
> on the host you're running your job from.
>
>
> Is there an issue you're coming up against when trying to run your
>
> job on a cluster?
>
>
> -Sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
> http://virtualschool.edu
>
>
>
>
>

Re: Maven Cloudera Configuration problem

Posted by Pavan Sudheendra <pa...@gmail.com>.
Here are the log details when i run the jar file:


08:10:29,738  INFO ZooKeeper:438 - Initiating client connection,
connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
08:10:29,777  INFO RecoverableZooKeeper:104 - The identifier of this
process is 12909@xx-xxxx-xxx-xx.eu-west-
1.compute.internal
08:10:29,784  INFO ClientCnxn:966 - Opening socket connection to
server localhost/127.0.0.1:2181. Will not attempt to authenticate
using SASL (Unable to locate a login configuration)
08:10:29,796  INFO ClientCnxn:849 - Socket connection established to
localhost/127.0.0.1:2181, initiating session
08:10:29,804  INFO ClientCnxn:1207 - Session establishment complete on
server localhost/127.0.0.1:2181, sessionid = 0x13ff1cff71b5503,
negotiated timeout = 60000
08:10:29,905  WARN Configuration:824 - hadoop.native.lib is
deprecated. Instead, use io.native.lib.available

Is it utilizing the cluster? Sorry for a noob question.
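One way to read that log: the `connectString=localhost:2181` line shows the HBase/ZooKeeper client resolved its quorum to localhost, i.e. it is still running on local configuration rather than the cluster's. A small, hypothetical helper to sanity-check a connect string (the class and host names here are made up for illustration):

```java
// Heuristic: a ZooKeeper connect string that only names the local host
// suggests the client never picked up the cluster's hbase/zookeeper config.
public class QuorumCheck {

    static boolean pointsAtLocalhost(String connectString) {
        for (String hostPort : connectString.split(",")) {
            String host = hostPort.split(":")[0].trim();
            if (!host.equals("localhost") && !host.equals("127.0.0.1")) {
                return false; // at least one real cluster host is named
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(pointsAtLocalhost("localhost:2181"));
        System.out.println(pointsAtLocalhost("zk1.internal:2181,zk2.internal:2181"));
    }
}
```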

On Wed, Aug 14, 2013 at 5:24 AM, Suresh Srinivas <su...@hortonworks.com> wrote:
> Folks, can you please take this thread to CDH related mailing list?
>
>
> On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox <br...@gmail.com> wrote:
>>
>> That link got my hopes up. But Cloudera Manager  (what I'm running; on
>> CDH4) does not offer an "Export Client Config" option. What am I missing?
>>
>> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>>
>> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
>> property does not point to 'local' but instead to your job-tracker host and
>> port.
>>
>> *But before that*, as Sandy said, your client machine (from where you will
>> be kicking off your jobs and apps) should be using config files which will
>> have your cluster's configuration. This is the alternative that you should
>> follow if you don't want to bundle the configs for your cluster in the
>> application itself (either in java code or in separate copies of the config
>> files with the relevant properties set). This was something which I was
>> suggesting early on just to get you started using your cluster instead of
>> local mode.
>>
>> By the way, have you seen the following link? It gives you step-by-step
>> information about how to generate config files specific to your cluster and
>> then how to place and use them from any machine you want to designate as
>> your client. Running your jobs from one of the datanodes without proper
>> config would not work.
>>
>> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>>
>> Regards,
>> Shahab
>>
>>
>> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra
>> <pa...@gmail.com>wrote:
>>
>> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
>> job on one datanode..
>>
>> What changes should i make so that my application would take advantage
>> of the cluster as a whole?
>>
>> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>>
>> Nothing in your pom.xml should affect the configurations your job runs
>>
>> with.
>>
>>
>> Are you running your job from a node on the cluster? When you say
>>
>> localhost configurations, do you mean it's using the LocalJobRunner?
>>
>>
>> -sandy
>>
>> (iphnoe tpying)
>>
>> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>>
>> When i actually run the job on the multi node cluster, logs shows it
>> uses localhost configurations which i don't want..
>>
>> I just have a pom.xml which lists all the dependencies like standard
>> hadoop, standard hbase, standard zookeeper etc., Should i remove these
>> dependencies?
>>
>> I want the cluster settings to apply in my map-reduce application..
>> So, this is where i'm stuck at..
>>
>> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>> Hi Shabab and Sandy,
>> The thing is we have a 6 node cloudera cluster running.. For
>> development purposes, i was building a map-reduce application on a
>> single node apache distribution hadoop with maven..
>>
>> To be frank, i don't know how to deploy this application on a multi
>> node cloudera cluster. I am fairly well versed with Multi Node Apache
>> Hadoop Distribution.. So, how can i go forward?
>>
>> Thanks for all the help :)
>>
>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>>
>> Hi Pavan,
>>
>> Configuration properties generally aren't included in the jar itself
>>
>> unless you explicitly set them in your java code. Rather they're picked up
>> from the mapred-site.xml file located in the Hadoop configuration
>> directory
>> on the host you're running your job from.
>>
>>
>> Is there an issue you're coming up against when trying to run your
>>
>> job on a cluster?
>>
>>
>> -Sandy
>>
>> (iphnoe tpying)
>>
>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>>
>> Hi,
>> I'm currently using maven to build the jars necessary for my
>> map-reduce program to run and it works for a single node cluster..
>>
>> For a multi node cluster, how do i specify my map-reduce program to
>> ingest the cluster settings instead of localhost settings?
>> I don't know how to specify this using maven to build my jar.
>>
>> I'm using the cdh distribution by the way..
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
>> http://virtualschool.edu
>>
>>
>>
>>
>
>
>
> --
> http://hortonworks.com/download/
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader of
> this message is not the intended recipient, you are hereby notified that any
> printing, copying, dissemination, distribution, disclosure or forwarding of
> this communication is strictly prohibited. If you have received this
> communication in error, please contact the sender immediately and delete it
> from your system. Thank You.



-- 
Regards-
Pavan

Re: Maven Cloudera Configuration problem

Posted by Pavan Sudheendra <pa...@gmail.com>.
Here are the log details when i run the jar file:


08:10:29,738  INFO ZooKeeper:438 - Initiating client connection,
connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
08:10:29,777  INFO RecoverableZooKeeper:104 - The identifier of this
process is 12909@xx-xxxx-xxx-xx.eu-west-
1.compute.internal
08:10:29,784  INFO ClientCnxn:966 - Opening socket connection to
server localhost/127.0.0.1:2181. Will not attempt to authenticate
using SASL (Unable to locate a login configuration)
08:10:29,796  INFO ClientCnxn:849 - Socket connection established to
localhost/127.0.0.1:2181, initiating session
08:10:29,804  INFO ClientCnxn:1207 - Session establishment complete on
server localhost/127.0.0.1:2181, sessionid = 0x13ff1cff71b5503,
negotiated timeout = 60000
08:10:29,905  WARN Configuration:824 - hadoop.native.lib is
deprecated. Instead, use io.native.lib.available

Is it utilizing the cluster? Sorry for a noob question.

On Wed, Aug 14, 2013 at 5:24 AM, Suresh Srinivas <su...@hortonworks.com> wrote:
> Folks, can you please take this thread to CDH related mailing list?
>
>
> On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox <br...@gmail.com> wrote:
>>
>> That link got my hopes up. But Cloudera Manager  (what I'm running; on
>> CDH4) does not offer an "Export Client Config" option. What am I missing?
>>
>> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>>
>> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
>> property does not point to 'local' an instead to your job-tracker host and
>> port.
>>
>> *But before that* as Sandy said, your client machine (from where you will
>> be kicking of your jobs and apps) should be using config files which will
>> have your cluster's configuration. This is the alternative that you should
>> follow if you don't want to bundle the configs for your cluster in the
>> application itself (either in java code or separate copies of relevant
>> properties set of config files.) This was something which I was suggesting
>> early on to just to get you started using your cluster instead of local
>> mode.
>>
>> By the way have you seen the following link? It gives you step by step
>> information about how to generate config files from your cluster specific
>> to your cluster and then how to place them and use the from any machine
>> you
>> want to designate as your client. Running your jobs form one of the
>> datanodes without proper config would not work.
>>
>> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>>
>> Regards,
>> Shahab
>>
>>
>> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra
>> <pa...@gmail.com>wrote:
>>
>> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
>> job on one datanode..
>>
>> What changes should i make so that my application would take advantage
>> of the cluster as a whole?
>>
>> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>>
>> Nothing in your pom.xml should affect the configurations your job runs
>>
>> with.
>>
>>
>> Are you running your job from a node on the cluster? When you say
>>
>> localhost configurations, do you mean it's using the LocalJobRunner?
>>
>>
>> -sandy
>>
>> (iphnoe tpying)
>>
>> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>>
>> When i actually run the job on the multi node cluster, logs shows it
>> uses localhost configurations which i don't want..
>>
>> I just have a pom.xml which lists all the dependencies like standard
>> hadoop, standard hbase, standard zookeeper etc., Should i remove these
>> dependencies?
>>
>> I want the cluster settings to apply in my map-reduce application..
>> So, this is where i'm stuck at..
>>
>> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>> Hi Shabab and Sandy,
>> The thing is we have a 6 node cloudera cluster running.. For
>> development purposes, i was building a map-reduce application on a
>> single node apache distribution hadoop with maven..
>>
>> To be frank, i don't know how to deploy this application on a multi
>> node cloudera cluster. I am fairly well versed with Multi Node Apache
>> Hadoop Distribution.. So, how can i go forward?
>>
>> Thanks for all the help :)
>>
>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>>
>> Hi Pavan,
>>
>> Configuration properties generally aren't included in the jar itself
>>
>> unless you explicitly set them in your java code. Rather they're picked up
>> from the mapred-site.xml file located in the Hadoop configuration
>> directory
>> on the host you're running your job from.
>>
>>
>> Is there an issue you're coming up against when trying to run your
>>
>> job on a cluster?
>>
>>
>> -Sandy
>>
>> (iphnoe tpying)
>>
>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>>
>> Hi,
>> I'm currently using maven to build the jars necessary for my
>> map-reduce program to run and it works for a single node cluster..
>>
>> For a multi node cluster, how do i specify my map-reduce program to
>> ingest the cluster settings instead of localhost settings?
>> I don't know how to specify this using maven to build my jar.
>>
>> I'm using the cdh distribution by the way..
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
>> http://virtualschool.edu
>>
>>
>>
>>
>
>
>
> --
> http://hortonworks.com/download/
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader of
> this message is not the intended recipient, you are hereby notified that any
> printing, copying, dissemination, distribution, disclosure or forwarding of
> this communication is strictly prohibited. If you have received this
> communication in error, please contact the sender immediately and delete it
> from your system. Thank You.



-- 
Regards-
Pavan

Re: Maven Cloudera Configuration problem

Posted by Pavan Sudheendra <pa...@gmail.com>.
Here are the log details when i run the jar file:


08:10:29,738  INFO ZooKeeper:438 - Initiating client connection,
connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
08:10:29,777  INFO RecoverableZooKeeper:104 - The identifier of this
process is 12909@xx-xxxx-xxx-xx.eu-west-
1.compute.internal
08:10:29,784  INFO ClientCnxn:966 - Opening socket connection to
server localhost/127.0.0.1:2181. Will not attempt to authenticate
using SASL (Unable to locate a login configuration)
08:10:29,796  INFO ClientCnxn:849 - Socket connection established to
localhost/127.0.0.1:2181, initiating session
08:10:29,804  INFO ClientCnxn:1207 - Session establishment complete on
server localhost/127.0.0.1:2181, sessionid = 0x13ff1cff71b5503,
negotiated timeout = 60000
08:10:29,905  WARN Configuration:824 - hadoop.native.lib is
deprecated. Instead, use io.native.lib.available

Is it utilizing the cluster? Sorry for a noob question.

On Wed, Aug 14, 2013 at 5:24 AM, Suresh Srinivas <su...@hortonworks.com> wrote:
> Folks, can you please take this thread to CDH related mailing list?
>
>
> On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox <br...@gmail.com> wrote:
>>
>> That link got my hopes up. But Cloudera Manager  (what I'm running; on
>> CDH4) does not offer an "Export Client Config" option. What am I missing?
>>
>> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>>
>> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
>> property does not point to 'local' an instead to your job-tracker host and
>> port.
>>
>> *But before that* as Sandy said, your client machine (from where you will
>> be kicking of your jobs and apps) should be using config files which will
>> have your cluster's configuration. This is the alternative that you should
>> follow if you don't want to bundle the configs for your cluster in the
>> application itself (either in java code or separate copies of relevant
>> properties set of config files.) This was something which I was suggesting
>> early on to just to get you started using your cluster instead of local
>> mode.
>>
>> By the way have you seen the following link? It gives you step by step
>> information about how to generate config files from your cluster specific
>> to your cluster and then how to place them and use the from any machine
>> you
>> want to designate as your client. Running your jobs form one of the
>> datanodes without proper config would not work.
>>
>> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>>
>> Regards,
>> Shahab
>>
>>
>> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra
>> <pa...@gmail.com>wrote:
>>
>> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
>> job on one datanode..
>>
>> What changes should i make so that my application would take advantage
>> of the cluster as a whole?
>>
>> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>>
>> Nothing in your pom.xml should affect the configurations your job runs
>>
>> with.
>>
>>
>> Are you running your job from a node on the cluster? When you say
>>
>> localhost configurations, do you mean it's using the LocalJobRunner?
>>
>>
>> -sandy
>>
>> (iphnoe tpying)
>>
>> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>>
>> When i actually run the job on the multi node cluster, logs shows it
>> uses localhost configurations which i don't want..
>>
>> I just have a pom.xml which lists all the dependencies like standard
>> hadoop, standard hbase, standard zookeeper etc., Should i remove these
>> dependencies?
>>
>> I want the cluster settings to apply in my map-reduce application..
>> So, this is where i'm stuck at..
>>
>> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>> Hi Shabab and Sandy,
>> The thing is we have a 6 node cloudera cluster running.. For
>> development purposes, i was building a map-reduce application on a
>> single node apache distribution hadoop with maven..
>>
>> To be frank, i don't know how to deploy this application on a multi
>> node cloudera cluster. I am fairly well versed with Multi Node Apache
>> Hadoop Distribution.. So, how can i go forward?
>>
>> Thanks for all the help :)
>>
>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>>
>> Hi Pavan,
>>
>> Configuration properties generally aren't included in the jar itself
>>
>> unless you explicitly set them in your java code. Rather they're picked up
>> from the mapred-site.xml file located in the Hadoop configuration
>> directory
>> on the host you're running your job from.
>>
>>
>> Is there an issue you're coming up against when trying to run your
>>
>> job on a cluster?
>>
>>
>> -Sandy
>>
>> (iphnoe tpying)
>>
>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>>
>> Hi,
>> I'm currently using maven to build the jars necessary for my
>> map-reduce program to run and it works for a single node cluster..
>>
>> For a multi node cluster, how do i specify my map-reduce program to
>> ingest the cluster settings instead of localhost settings?
>> I don't know how to specify this using maven to build my jar.
>>
>> I'm using the cdh distribution by the way..
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
>> http://virtualschool.edu
>>
>>
>>
>>
>
>
>
> --
> http://hortonworks.com/download/
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader of
> this message is not the intended recipient, you are hereby notified that any
> printing, copying, dissemination, distribution, disclosure or forwarding of
> this communication is strictly prohibited. If you have received this
> communication in error, please contact the sender immediately and delete it
> from your system. Thank You.



-- 
Regards-
Pavan

Re: Maven Cloudera Configuration problem

Posted by Pavan Sudheendra <pa...@gmail.com>.
Here are the log details when i run the jar file:


08:10:29,738  INFO ZooKeeper:438 - Initiating client connection,
connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
08:10:29,777  INFO RecoverableZooKeeper:104 - The identifier of this
process is 12909@xx-xxxx-xxx-xx.eu-west-
1.compute.internal
08:10:29,784  INFO ClientCnxn:966 - Opening socket connection to
server localhost/127.0.0.1:2181. Will not attempt to authenticate
using SASL (Unable to locate a login configuration)
08:10:29,796  INFO ClientCnxn:849 - Socket connection established to
localhost/127.0.0.1:2181, initiating session
08:10:29,804  INFO ClientCnxn:1207 - Session establishment complete on
server localhost/127.0.0.1:2181, sessionid = 0x13ff1cff71b5503,
negotiated timeout = 60000
08:10:29,905  WARN Configuration:824 - hadoop.native.lib is
deprecated. Instead, use io.native.lib.available

Is it utilizing the cluster? Sorry for a noob question.
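
The ZooKeeper lines above only show the HBase client connecting to a quorum at localhost; they don't by themselves say whether MapReduce will use the cluster. What decides that in MRv1 is the mapred.job.tracker value the client-side configuration resolves to: "local" (the default) means LocalJobRunner. The standalone sketch below imitates that lookup with plain JDK XML parsing (the class and method names here are made up for illustration; Hadoop's own Configuration class does this lookup for real):

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class TrackerCheck {
    // Resolve mapred.job.tracker the way Hadoop reads it from mapred-site.xml.
    // An absent property (or a parse problem) falls back to "local", which is
    // exactly the MRv1 condition for running jobs in-process via LocalJobRunner.
    static String jobTracker(String mapredSiteXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(mapredSiteXml.getBytes("UTF-8")));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent();
                if ("mapred.job.tracker".equals(name)) {
                    return p.getElementsByTagName("value").item(0).getTextContent();
                }
            }
        } catch (Exception e) {
            // fall through to the default
        }
        return "local";
    }

    public static void main(String[] args) {
        String clusterConf = "<configuration><property>"
                + "<name>mapred.job.tracker</name>"
                + "<value>jt-host.example.com:8021</value>"
                + "</property></configuration>";
        System.out.println(jobTracker(clusterConf));         // jt-host.example.com:8021
        System.out.println(jobTracker("<configuration/>"));  // local
    }
}
```

If the configuration your job actually loads resolves to "local", the job stays in a single JVM no matter how many nodes the cluster has.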

On Wed, Aug 14, 2013 at 5:24 AM, Suresh Srinivas <su...@hortonworks.com> wrote:
> Folks, can you please take this thread to CDH related mailing list?
>
>
> On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox <br...@gmail.com> wrote:
>>
>> That link got my hopes up. But Cloudera Manager  (what I'm running; on
>> CDH4) does not offer an "Export Client Config" option. What am I missing?
>>
>> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>>
>> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
>> property does not point to 'local' and instead points to your job-tracker
>> host and port.
>>
>> *But before that* as Sandy said, your client machine (from where you will
>> be kicking off your jobs and apps) should be using config files which will
>> have your cluster's configuration. This is the alternative that you should
>> follow if you don't want to bundle the configs for your cluster in the
>> application itself (either in java code or separate copies of relevant
>> properties set of config files.) This was something which I was suggesting
>> early on to just to get you started using your cluster instead of local
>> mode.
>>
>> By the way have you seen the following link? It gives you step by step
>> information about how to generate config files from your cluster specific
>> to your cluster and then how to place them and use them from any machine
>> you want to designate as your client. Running your jobs from one of the
>> datanodes without proper config would not work.
>>
>> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>>
>> Regards,
>> Shahab
>>
>>
>> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra
>> <pa...@gmail.com>wrote:
>>
>> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
>> job on one datanode..
>>
>> What changes should i make so that my application would take advantage
>> of the cluster as a whole?
>>
>> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>>
>> Nothing in your pom.xml should affect the configurations your job runs
>>
>> with.
>>
>>
>> Are you running your job from a node on the cluster? When you say
>>
>> localhost configurations, do you mean it's using the LocalJobRunner?
>>
>>
>> -sandy
>>
>> (iphnoe tpying)
>>
>> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>>
>> When I actually run the job on the multi-node cluster, the logs show it
>> using localhost configurations, which I don't want..
>>
>> I just have a pom.xml which lists all the dependencies like standard
>> hadoop, standard hbase, standard zookeeper etc., Should i remove these
>> dependencies?
>>
>> I want the cluster settings to apply in my map-reduce application..
>> So, this is where i'm stuck at..
>>
>> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>> Hi Shahab and Sandy,
>> The thing is we have a 6 node cloudera cluster running.. For
>> development purposes, i was building a map-reduce application on a
>> single node apache distribution hadoop with maven..
>>
>> To be frank, i don't know how to deploy this application on a multi
>> node cloudera cluster. I am fairly well versed with Multi Node Apache
>> Hadoop Distribution.. So, how can i go forward?
>>
>> Thanks for all the help :)
>>
>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>>
>> Hi Pavan,
>>
>> Configuration properties generally aren't included in the jar itself
>>
>> unless you explicitly set them in your java code. Rather they're picked up
>> from the mapred-site.xml file located in the Hadoop configuration
>> directory
>> on the host you're running your job from.
>>
>>
>> Is there an issue you're coming up against when trying to run your
>>
>> job on a cluster?
>>
>>
>> -Sandy
>>
>> (iphnoe tpying)
>>
>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>>
>> wrote:
>>
>>
>> Hi,
>> I'm currently using maven to build the jars necessary for my
>> map-reduce program to run and it works for a single node cluster..
>>
>> For a multi node cluster, how do i specify my map-reduce program to
>> ingest the cluster settings instead of localhost settings?
>> I don't know how to specify this using maven to build my jar.
>>
>> I'm using the cdh distribution by the way..
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>>
>>
>> --
>> Regards-
>> Pavan
>>
>>
>> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
>> http://virtualschool.edu
>>
>>
>>
>>
>
>
>
> --
> http://hortonworks.com/download/
>



-- 
Regards-
Pavan

Re: Maven Cloudera Configuration problem

Posted by Suresh Srinivas <su...@hortonworks.com>.
Folks, can you please take this thread to CDH related mailing list?


On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox <br...@gmail.com> wrote:

> That link got my hopes up. But Cloudera Manager  (what I'm running; on
> CDH4) does not offer an "Export Client Config" option. What am I missing?
>
> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>
> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
> property does not point to 'local' and instead points to your job-tracker
> host and port.
>
> *But before that* as Sandy said, your client machine (from where you will
> be kicking off your jobs and apps) should be using config files which will
> have your cluster's configuration. This is the alternative that you should
> follow if you don't want to bundle the configs for your cluster in the
> application itself (either in java code or separate copies of relevant
> properties set of config files.) This was something which I was suggesting
> early on to just to get you started using your cluster instead of local
> mode.
>
> By the way have you seen the following link? It gives you step by step
> information about how to generate config files from your cluster specific
> to your cluster and then how to place them and use them from any machine you
> want to designate as your client. Running your jobs from one of the
> datanodes without proper config would not work.
>
> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>
> Regards,
> Shahab
>
>
> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra <pavan0591@gmail.com
> >wrote:
>
> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode..
>
> What changes should i make so that my application would take advantage
> of the cluster as a whole?
>
> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>
> Nothing in your pom.xml should affect the configurations your job runs
>
> with.
>
>
> Are you running your job from a node on the cluster? When you say
>
> localhost configurations, do you mean it's using the LocalJobRunner?
>
>
> -sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> When I actually run the job on the multi-node cluster, the logs show it
> using localhost configurations, which I don't want..
>
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
>
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
>
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
> Hi Shahab and Sandy,
> The thing is we have a 6 node cloudera cluster running.. For
> development purposes, i was building a map-reduce application on a
> single node apache distribution hadoop with maven..
>
> To be frank, i don't know how to deploy this application on a multi
> node cloudera cluster. I am fairly well versed with Multi Node Apache
> Hadoop Distribution.. So, how can i go forward?
>
> Thanks for all the help :)
>
> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>
> Hi Pavan,
>
> Configuration properties generally aren't included in the jar itself
>
> unless you explicitly set them in your java code. Rather they're picked up
> from the mapred-site.xml file located in the Hadoop configuration directory
> on the host you're running your job from.
>
>
> Is there an issue you're coming up against when trying to run your
>
> job on a cluster?
>
>
> -Sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
> http://virtualschool.edu
>
>
>
>
>


-- 
http://hortonworks.com/download/


Re: Maven Cloudera Configuration problem

Posted by Shahab Yunus <sh...@gmail.com>.
In our Cloudera 4.2.0 cluster, I log in as the *admin* user (do you have
appropriate permissions, by the way?). Then I click on any one of the 3
services (hbase, mapred, hdfs and excluding zookeeper) from the top-leftish
menu. Then for each of these I can click the *Configuration* tab which is
in the top-middlish section of the page. Once the configuration page opens
then I click on Action menu on the top-right. One of the sub-menu of this
is *Download Client Configuration* which as the name says downloads the
config files (zip file to be exact) to be used at client machines.

Regards,
Shahab
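
Once that zip is on the client machine, wiring it up is roughly the following (a sketch only; the zip name, directory, and jar/driver names are assumptions for your environment):

```shell
# Hypothetical paths throughout; adjust for your setup.
CONF_DIR="$HOME/hadoop-client-conf"
mkdir -p "$CONF_DIR"
# unzip -o mapreduce-clientconfig.zip -d "$CONF_DIR"   # zip downloaded from Cloudera Manager
export HADOOP_CONF_DIR="$CONF_DIR"                     # the hadoop CLI picks configs up from here
echo "Using client configuration from: $HADOOP_CONF_DIR"
# hadoop jar target/myjob.jar com.example.MyDriver input output   # now targets the cluster
```

With HADOOP_CONF_DIR pointing at the downloaded configuration, jobs submitted from this machine use the cluster's JobTracker instead of LocalJobRunner.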


On Tue, Aug 13, 2013 at 6:07 PM, Brad Cox <br...@gmail.com> wrote:

> That link got my hopes up. But Cloudera Manager  (what I'm running; on
> CDH4) does not offer an "Export Client Config" option. What am I missing?
>
> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>
> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
> property does not point to 'local' and instead points to your job-tracker
> host and port.
>
> *But before that* as Sandy said, your client machine (from where you will
> be kicking off your jobs and apps) should be using config files which will
> have your cluster's configuration. This is the alternative that you should
> follow if you don't want to bundle the configs for your cluster in the
> application itself (either in java code or separate copies of relevant
> properties set of config files.) This was something which I was suggesting
> early on to just to get you started using your cluster instead of local
> mode.
>
> By the way have you seen the following link? It gives you step by step
> information about how to generate config files from your cluster specific
> to your cluster and then how to place them and use them from any machine you
> want to designate as your client. Running your jobs from one of the
> datanodes without proper config would not work.
>
> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>
> Regards,
> Shahab
>
>
> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra <pavan0591@gmail.com
> >wrote:
>
> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode..
>
> What changes should i make so that my application would take advantage
> of the cluster as a whole?
>
> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>
> Nothing in your pom.xml should affect the configurations your job runs
>
> with.
>
>
> Are you running your job from a node on the cluster? When you say
>
> localhost configurations, do you mean it's using the LocalJobRunner?
>
>
> -sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> When I actually run the job on the multi-node cluster, the logs show it
> using localhost configurations, which I don't want..
>
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
>
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
>
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
> Hi Shahab and Sandy,
> The thing is we have a 6 node cloudera cluster running.. For
> development purposes, i was building a map-reduce application on a
> single node apache distribution hadoop with maven..
>
> To be frank, i don't know how to deploy this application on a multi
> node cloudera cluster. I am fairly well versed with Multi Node Apache
> Hadoop Distribution.. So, how can i go forward?
>
> Thanks for all the help :)
>
> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>
> Hi Pavan,
>
> Configuration properties generally aren't included in the jar itself
>
> unless you explicitly set them in your java code. Rather they're picked up
> from the mapred-site.xml file located in the Hadoop configuration directory
> on the host you're running your job from.
>
>
> Is there an issue you're coming up against when trying to run your
>
> job on a cluster?
>
>
> -Sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
> http://virtualschool.edu
>
>
>
>
>


Re: Maven Cloudera Configuration problem

Posted by Suresh Srinivas <su...@hortonworks.com>.
Folks, can you please take this thread to CDH related mailing list?


On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox <br...@gmail.com> wrote:

> That link got my hopes up. But Cloudera Manager  (what I'm running; on
> CDH4) does not offer an "Export Client Config" option. What am I missing?
>
> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>
> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
> property does not point to 'local' an instead to your job-tracker host and
> port.
>
> *But before that* as Sandy said, your client machine (from where you will
> be kicking of your jobs and apps) should be using config files which will
> have your cluster's configuration. This is the alternative that you should
> follow if you don't want to bundle the configs for your cluster in the
> application itself (either in java code or separate copies of relevant
> properties set of config files.) This was something which I was suggesting
> early on to just to get you started using your cluster instead of local
> mode.
>
> By the way have you seen the following link? It gives you step by step
> information about how to generate config files from your cluster specific
> to your cluster and then how to place them and use the from any machine you
> want to designate as your client. Running your jobs form one of the
> datanodes without proper config would not work.
>
> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>
> Regards,
> Shahab
>
>
> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra <pavan0591@gmail.com
> >wrote:
>
> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode..
>
> What changes should i make so that my application would take advantage
> of the cluster as a whole?
>
> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>
> Nothing in your pom.xml should affect the configurations your job runs
>
> with.
>
>
> Are you running your job from a node on the cluster? When you say
>
> localhost configurations, do you mean it's using the LocalJobRunner?
>
>
> -sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> When i actually run the job on the multi node cluster, logs shows it
> uses localhost configurations which i don't want..
>
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
>
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
>
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
> Hi Shabab and Sandy,
> The thing is we have a 6 node cloudera cluster running.. For
> development purposes, i was building a map-reduce application on a
> single node apache distribution hadoop with maven..
>
> To be frank, i don't know how to deploy this application on a multi
> node cloudera cluster. I am fairly well versed with Multi Node Apache
> Hadoop Distribution.. So, how can i go forward?
>
> Thanks for all the help :)
>
> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>
> Hi Pavan,
>
> Configuration properties generally aren't included in the jar itself
>
> unless you explicitly set them in your java code. Rather they're picked up
> from the mapred-site.xml file located in the Hadoop configuration directory
> on the host you're running your job from.
>
>
> Is there an issue you're coming up against when trying to run your
>
> job on a cluster?
>
>
> -Sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
> http://virtualschool.edu
>
>
>
>
>


-- 
http://hortonworks.com/download/

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Maven Cloudera Configuration problem

Posted by Suresh Srinivas <su...@hortonworks.com>.
Folks, can you please take this thread to CDH related mailing list?


On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox <br...@gmail.com> wrote:

> That link got my hopes up. But Cloudera Manager  (what I'm running; on
> CDH4) does not offer an "Export Client Config" option. What am I missing?
>
> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>
> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
> property does not point to 'local' an instead to your job-tracker host and
> port.
>
> *But before that* as Sandy said, your client machine (from where you will
> be kicking of your jobs and apps) should be using config files which will
> have your cluster's configuration. This is the alternative that you should
> follow if you don't want to bundle the configs for your cluster in the
> application itself (either in java code or separate copies of relevant
> properties set of config files.) This was something which I was suggesting
> early on to just to get you started using your cluster instead of local
> mode.
>
> By the way have you seen the following link? It gives you step by step
> information about how to generate config files from your cluster specific
> to your cluster and then how to place them and use the from any machine you
> want to designate as your client. Running your jobs form one of the
> datanodes without proper config would not work.
>
> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>
> Regards,
> Shahab
>
>
> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra <pavan0591@gmail.com
> >wrote:
>
> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode..
>
> What changes should i make so that my application would take advantage
> of the cluster as a whole?
>
> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>
> Nothing in your pom.xml should affect the configurations your job runs
>
> with.
>
>
> Are you running your job from a node on the cluster? When you say
>
> localhost configurations, do you mean it's using the LocalJobRunner?
>
>
> -sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> When i actually run the job on the multi node cluster, logs shows it
> uses localhost configurations which i don't want..
>
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
>
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
>
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
> Hi Shabab and Sandy,
> The thing is we have a 6 node cloudera cluster running.. For
> development purposes, i was building a map-reduce application on a
> single node apache distribution hadoop with maven..
>
> To be frank, i don't know how to deploy this application on a multi
> node cloudera cluster. I am fairly well versed with Multi Node Apache
> Hadoop Distribution.. So, how can i go forward?
>
> Thanks for all the help :)
>
> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>
> Hi Pavan,
>
> Configuration properties generally aren't included in the jar itself
>
> unless you explicitly set them in your java code. Rather they're picked up
> from the mapred-site.xml file located in the Hadoop configuration directory
> on the host you're running your job from.
>
>
> Is there an issue you're coming up against when trying to run your
>
> job on a cluster?
>
>
> -Sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
> http://virtualschool.edu
>
>
>
>
>


-- 
http://hortonworks.com/download/

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


Re: Maven Cloudera Configuration problem

Posted by Shahab Yunus <sh...@gmail.com>.
In our Cloudera 4.2.0 cluster, I log in as the *admin* user (do you have
appropriate permissions, by the way?). Then I click on any one of the 3
services (hbase, mapred, hdfs, excluding zookeeper) from the top-leftish
menu. For each of these I can click the *Configuration* tab, which is
in the top-middlish section of the page. Once the configuration page opens,
I click the Actions menu on the top right. One of its sub-menus
is *Download Client Configuration*, which, as the name says, downloads the
config files (a zip file, to be exact) to be used on client machines.
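
As a rough sketch of what that downloaded zip contains: the core-site.xml
inside points clients at the cluster's NameNode (the hostname and port below
are placeholders, not values from any real cluster):

```xml
<!-- core-site.xml from the client configuration zip (placeholder host) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
```

Unpack the zip into a directory and point HADOOP_CONF_DIR at it on the
client machine before launching jobs.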

Regards,
Shahab


On Tue, Aug 13, 2013 at 6:07 PM, Brad Cox <br...@gmail.com> wrote:

> That link got my hopes up. But Cloudera Manager  (what I'm running; on
> CDH4) does not offer an "Export Client Config" option. What am I missing?
>
> On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:
>
> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
> property does not point to 'local' an instead to your job-tracker host and
> port.
>
> *But before that* as Sandy said, your client machine (from where you will
> be kicking of your jobs and apps) should be using config files which will
> have your cluster's configuration. This is the alternative that you should
> follow if you don't want to bundle the configs for your cluster in the
> application itself (either in java code or separate copies of relevant
> properties set of config files.) This was something which I was suggesting
> early on to just to get you started using your cluster instead of local
> mode.
>
> By the way have you seen the following link? It gives you step by step
> information about how to generate config files from your cluster specific
> to your cluster and then how to place them and use the from any machine you
> want to designate as your client. Running your jobs form one of the
> datanodes without proper config would not work.
>
> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>
> Regards,
> Shahab
>
>
> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra <pavan0591@gmail.com
> >wrote:
>
> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode..
>
> What changes should i make so that my application would take advantage
> of the cluster as a whole?
>
> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>
> Nothing in your pom.xml should affect the configurations your job runs
>
> with.
>
>
> Are you running your job from a node on the cluster? When you say
>
> localhost configurations, do you mean it's using the LocalJobRunner?
>
>
> -sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> When i actually run the job on the multi node cluster, logs shows it
> uses localhost configurations which i don't want..
>
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
>
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
>
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
> Hi Shabab and Sandy,
> The thing is we have a 6 node cloudera cluster running.. For
> development purposes, i was building a map-reduce application on a
> single node apache distribution hadoop with maven..
>
> To be frank, i don't know how to deploy this application on a multi
> node cloudera cluster. I am fairly well versed with Multi Node Apache
> Hadoop Distribution.. So, how can i go forward?
>
> Thanks for all the help :)
>
> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>
> Hi Pavan,
>
> Configuration properties generally aren't included in the jar itself
>
> unless you explicitly set them in your java code. Rather they're picked up
> from the mapred-site.xml file located in the Hadoop configuration directory
> on the host you're running your job from.
>
>
> Is there an issue you're coming up against when trying to run your
>
> job on a cluster?
>
>
> -Sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>
> wrote:
>
>
> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
> Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com
> http://virtualschool.edu
>
>
>
>
>

Re: Maven Cloudera Configuration problem

Posted by Brad Cox <br...@gmail.com>.
That link got my hopes up. But Cloudera Manager (what I'm running, on CDH4) does not offer an "Export Client Config" option. What am I missing?

On Aug 13, 2013, at 4:04 PM, Shahab Yunus <sh...@gmail.com> wrote:

> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
> property does not point to 'local' an instead to your job-tracker host and
> port.
> 
> *But before that* as Sandy said, your client machine (from where you will
> be kicking of your jobs and apps) should be using config files which will
> have your cluster's configuration. This is the alternative that you should
> follow if you don't want to bundle the configs for your cluster in the
> application itself (either in java code or separate copies of relevant
> properties set of config files.) This was something which I was suggesting
> early on to just to get you started using your cluster instead of local
> mode.
> 
> By the way have you seen the following link? It gives you step by step
> information about how to generate config files from your cluster specific
> to your cluster and then how to place them and use the from any machine you
> want to designate as your client. Running your jobs form one of the
> datanodes without proper config would not work.
> 
> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
> 
> Regards,
> Shahab
> 
> 
> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra <pa...@gmail.com>wrote:
> 
>> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
>> job on one datanode..
>> 
>> What changes should i make so that my application would take advantage
>> of the cluster as a whole?
>> 
>> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
>>> Nothing in your pom.xml should affect the configurations your job runs
>> with.
>>> 
>>> Are you running your job from a node on the cluster? When you say
>> localhost configurations, do you mean it's using the LocalJobRunner?
>>> 
>>> -sandy
>>> 
>>> (iphnoe tpying)
>>> 
>>> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
>> wrote:
>>> 
>>>> When i actually run the job on the multi node cluster, logs shows it
>>>> uses localhost configurations which i don't want..
>>>> 
>>>> I just have a pom.xml which lists all the dependencies like standard
>>>> hadoop, standard hbase, standard zookeeper etc., Should i remove these
>>>> dependencies?
>>>> 
>>>> I want the cluster settings to apply in my map-reduce application..
>>>> So, this is where i'm stuck at..
>>>> 
>>>> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
>> wrote:
>>>>> Hi Shabab and Sandy,
>>>>> The thing is we have a 6 node cloudera cluster running.. For
>>>>> development purposes, i was building a map-reduce application on a
>>>>> single node apache distribution hadoop with maven..
>>>>> 
>>>>> To be frank, i don't know how to deploy this application on a multi
>>>>> node cloudera cluster. I am fairly well versed with Multi Node Apache
>>>>> Hadoop Distribution.. So, how can i go forward?
>>>>> 
>>>>> Thanks for all the help :)
>>>>> 
>>>>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>>>>>> Hi Pavan,
>>>>>> 
>>>>>> Configuration properties generally aren't included in the jar itself
>> unless you explicitly set them in your java code. Rather they're picked up
>> from the mapred-site.xml file located in the Hadoop configuration directory
>> on the host you're running your job from.
>>>>>> 
>>>>>> Is there an issue you're coming up against when trying to run your
>> job on a cluster?
>>>>>> 
>>>>>> -Sandy
>>>>>> 
>>>>>> (iphnoe tpying)
>>>>>> 
>>>>>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
>> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> I'm currently using maven to build the jars necessary for my
>>>>>>> map-reduce program to run and it works for a single node cluster..
>>>>>>> 
>>>>>>> For a multi node cluster, how do i specify my map-reduce program to
>>>>>>> ingest the cluster settings instead of localhost settings?
>>>>>>> I don't know how to specify this using maven to build my jar.
>>>>>>> 
>>>>>>> I'm using the cdh distribution by the way..
>>>>>>> --
>>>>>>> Regards-
>>>>>>> Pavan
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Regards-
>>>>> Pavan
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Regards-
>>>> Pavan
>> 
>> 
>> 
>> --
>> Regards-
>> Pavan
>> 

Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com http://virtualschool.edu





Re: Maven Cloudera Configuration problem

Posted by Shahab Yunus <sh...@gmail.com>.
You should not use LocalJobRunner. Make sure that the mapred.job.tracker
property does not point to 'local' and instead points to your job-tracker
host and port.

*But before that*, as Sandy said, your client machine (from where you will
be kicking off your jobs and apps) should be using config files which
have your cluster's configuration. This is the alternative to follow if
you don't want to bundle the configs for your cluster in the application
itself (either in Java code or in separate copies of the relevant config
files). This was something I was suggesting early on just to get you
started using your cluster instead of local mode.
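
A minimal sketch of the relevant mapred-site.xml entry, assuming MRv1 and a
placeholder jobtracker hostname and port (substitute your own):

```xml
<!-- mapred-site.xml: 'local' here triggers LocalJobRunner;
     use the real JobTracker host instead -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:8021</value>
  </property>
</configuration>
```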

By the way, have you seen the following link? It gives you step-by-step
information about how to generate config files specific to your cluster,
and then how to place them and use them from any machine you want to
designate as your client. Running your jobs from one of the datanodes
without proper config would not work.

https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration

Regards,
Shahab


On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra <pa...@gmail.com>wrote:

> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode..
>
> What changes should i make so that my application would take advantage
> of the cluster as a whole?
>
> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
> > Nothing in your pom.xml should affect the configurations your job runs
> with.
> >
> > Are you running your job from a node on the cluster? When you say
> localhost configurations, do you mean it's using the LocalJobRunner?
> >
> > -sandy
> >
> > (iphnoe tpying)
> >
> > On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
> wrote:
> >
> >> When i actually run the job on the multi node cluster, logs shows it
> >> uses localhost configurations which i don't want..
> >>
> >> I just have a pom.xml which lists all the dependencies like standard
> >> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> >> dependencies?
> >>
> >> I want the cluster settings to apply in my map-reduce application..
> >> So, this is where i'm stuck at..
> >>
> >> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
> wrote:
> >>> Hi Shabab and Sandy,
> >>> The thing is we have a 6 node cloudera cluster running.. For
> >>> development purposes, i was building a map-reduce application on a
> >>> single node apache distribution hadoop with maven..
> >>>
> >>> To be frank, i don't know how to deploy this application on a multi
> >>> node cloudera cluster. I am fairly well versed with Multi Node Apache
> >>> Hadoop Distribution.. So, how can i go forward?
> >>>
> >>> Thanks for all the help :)
> >>>
> >>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
> >>>> Hi Pavan,
> >>>>
> >>>> Configuration properties generally aren't included in the jar itself
> unless you explicitly set them in your java code. Rather they're picked up
> from the mapred-site.xml file located in the Hadoop configuration directory
> on the host you're running your job from.
> >>>>
> >>>> Is there an issue you're coming up against when trying to run your
> job on a cluster?
> >>>>
> >>>> -Sandy
> >>>>
> >>>> (iphnoe tpying)
> >>>>
> >>>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
> wrote:
> >>>>
> >>>>> Hi,
> >>>>> I'm currently using maven to build the jars necessary for my
> >>>>> map-reduce program to run and it works for a single node cluster..
> >>>>>
> >>>>> For a multi node cluster, how do i specify my map-reduce program to
> >>>>> ingest the cluster settings instead of localhost settings?
> >>>>> I don't know how to specify this using maven to build my jar.
> >>>>>
> >>>>> I'm using the cdh distribution by the way..
> >>>>> --
> >>>>> Regards-
> >>>>> Pavan
> >>>
> >>>
> >>>
> >>> --
> >>> Regards-
> >>> Pavan
> >>
> >>
> >>
> >> --
> >> Regards-
> >> Pavan
>
>
>
> --
> Regards-
> Pavan
>

Re: Maven Cloudera Configuration problem

Posted by Shahab Yunus <sh...@gmail.com>.
You should not use LocalJobRunner. Make sure that the mapred.job.tracker
property does not point to 'local' and instead points to your job-tracker
host and port.

*But before that*, as Sandy said, your client machine (from where you will
be kicking off your jobs and apps) should be using config files which
have your cluster's configuration. This is the alternative to follow if
you don't want to bundle the configs for your cluster in the application
itself (either in Java code or in separate copies of the relevant config
files). This was something I was suggesting early on just to get you
started using your cluster instead of local mode.

By the way, have you seen the following link? It gives you step-by-step
information about how to generate config files specific to your cluster,
and then how to place them and use them from any machine you want to
designate as your client. Running your jobs from one of the datanodes
without proper config would not work.

https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
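
As a concrete illustration (the hostname and port below are placeholders; use
the values from your cluster's generated client configs), the client-side
mapred-site.xml should name the real JobTracker rather than 'local':

```xml
<configuration>
  <property>
    <!-- host:port of the cluster JobTracker; the value 'local' triggers LocalJobRunner -->
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:8021</value>
  </property>
</configuration>
```

Similarly, fs.defaultFS in core-site.xml should point at the NameNode (e.g.
hdfs://namenode.example.com:8020) rather than the local filesystem.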

Regards,
Shahab


On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra <pa...@gmail.com>wrote:

> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode..
>
> What changes should i make so that my application would take advantage
> of the cluster as a whole?
>
> On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
> > Nothing in your pom.xml should affect the configurations your job runs
> with.
> >
> > Are you running your job from a node on the cluster? When you say
> localhost configurations, do you mean it's using the LocalJobRunner?
> >
> > -sandy
> >
> > (iphnoe tpying)
> >
> > On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com>
> wrote:
> >
> >> When i actually run the job on the multi node cluster, logs shows it
> >> uses localhost configurations which i don't want..
> >>
> >> I just have a pom.xml which lists all the dependencies like standard
> >> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> >> dependencies?
> >>
> >> I want the cluster settings to apply in my map-reduce application..
> >> So, this is where i'm stuck at..
> >>
> >> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com>
> wrote:
> >>> Hi Shabab and Sandy,
> >>> The thing is we have a 6 node cloudera cluster running.. For
> >>> development purposes, i was building a map-reduce application on a
> >>> single node apache distribution hadoop with maven..
> >>>
> >>> To be frank, i don't know how to deploy this application on a multi
> >>> node cloudera cluster. I am fairly well versed with Multi Node Apache
> >>> Hadoop Distribution.. So, how can i go forward?
> >>>
> >>> Thanks for all the help :)
> >>>
> >>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
> >>>> Hi Pavan,
> >>>>
> >>>> Configuration properties generally aren't included in the jar itself
> unless you explicitly set them in your java code. Rather they're picked up
> from the mapred-site.xml file located in the Hadoop configuration directory
> on the host you're running your job from.
> >>>>
> >>>> Is there an issue you're coming up against when trying to run your
> job on a cluster?
> >>>>
> >>>> -Sandy
> >>>>
> >>>> (iphnoe tpying)
> >>>>
> >>>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com>
> wrote:
> >>>>
> >>>>> Hi,
> >>>>> I'm currently using maven to build the jars necessary for my
> >>>>> map-reduce program to run and it works for a single node cluster..
> >>>>>
> >>>>> For a multi node cluster, how do i specify my map-reduce program to
> >>>>> ingest the cluster settings instead of localhost settings?
> >>>>> I don't know how to specify this using maven to build my jar.
> >>>>>
> >>>>> I'm using the cdh distribution by the way..
> >>>>> --
> >>>>> Regards-
> >>>>> Pavan
> >>>
> >>>
> >>>
> >>> --
> >>> Regards-
> >>> Pavan
> >>
> >>
> >>
> >> --
> >> Regards-
> >> Pavan
>
>
>
> --
> Regards-
> Pavan
>

Re: Maven Cloudera Configuration problem

Posted by Pavan Sudheendra <pa...@gmail.com>.
Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
job on one datanode.

What changes should I make so that my application would take advantage
of the cluster as a whole?

On Tue, Aug 13, 2013 at 10:33 PM,  <sa...@cloudera.com> wrote:
> Nothing in your pom.xml should affect the configurations your job runs with.
>
> Are you running your job from a node on the cluster? When you say localhost configurations, do you mean it's using the LocalJobRunner?
>
> -sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com> wrote:
>
>> When i actually run the job on the multi node cluster, logs shows it
>> uses localhost configurations which i don't want..
>>
>> I just have a pom.xml which lists all the dependencies like standard
>> hadoop, standard hbase, standard zookeeper etc., Should i remove these
>> dependencies?
>>
>> I want the cluster settings to apply in my map-reduce application..
>> So, this is where i'm stuck at..
>>
>> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com> wrote:
>>> Hi Shabab and Sandy,
>>> The thing is we have a 6 node cloudera cluster running.. For
>>> development purposes, i was building a map-reduce application on a
>>> single node apache distribution hadoop with maven..
>>>
>>> To be frank, i don't know how to deploy this application on a multi
>>> node cloudera cluster. I am fairly well versed with Multi Node Apache
>>> Hadoop Distribution.. So, how can i go forward?
>>>
>>> Thanks for all the help :)
>>>
>>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>>>> Hi Pavan,
>>>>
>>>> Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from.
>>>>
>>>> Is there an issue you're coming up against when trying to run your job on a cluster?
>>>>
>>>> -Sandy
>>>>
>>>> (iphnoe tpying)
>>>>
>>>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> I'm currently using maven to build the jars necessary for my
>>>>> map-reduce program to run and it works for a single node cluster..
>>>>>
>>>>> For a multi node cluster, how do i specify my map-reduce program to
>>>>> ingest the cluster settings instead of localhost settings?
>>>>> I don't know how to specify this using maven to build my jar.
>>>>>
>>>>> I'm using the cdh distribution by the way..
>>>>> --
>>>>> Regards-
>>>>> Pavan
>>>
>>>
>>>
>>> --
>>> Regards-
>>> Pavan
>>
>>
>>
>> --
>> Regards-
>> Pavan



-- 
Regards-
Pavan

Re: Maven Cloudera Configuration problem

Posted by sa...@cloudera.com.
Nothing in your pom.xml should affect the configurations your job runs with.

Are you running your job from a node on the cluster? When you say localhost configurations, do you mean it's using the LocalJobRunner?

-sandy

(iphnoe tpying)

On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra <pa...@gmail.com> wrote:

> When i actually run the job on the multi node cluster, logs shows it
> uses localhost configurations which i don't want..
> 
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
> 
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
> 
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com> wrote:
>> Hi Shabab and Sandy,
>> The thing is we have a 6 node cloudera cluster running.. For
>> development purposes, i was building a map-reduce application on a
>> single node apache distribution hadoop with maven..
>> 
>> To be frank, i don't know how to deploy this application on a multi
>> node cloudera cluster. I am fairly well versed with Multi Node Apache
>> Hadoop Distribution.. So, how can i go forward?
>> 
>> Thanks for all the help :)
>> 
>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>>> Hi Pavan,
>>> 
>>> Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from.
>>> 
>>> Is there an issue you're coming up against when trying to run your job on a cluster?
>>> 
>>> -Sandy
>>> 
>>> (iphnoe tpying)
>>> 
>>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> I'm currently using maven to build the jars necessary for my
>>>> map-reduce program to run and it works for a single node cluster..
>>>> 
>>>> For a multi node cluster, how do i specify my map-reduce program to
>>>> ingest the cluster settings instead of localhost settings?
>>>> I don't know how to specify this using maven to build my jar.
>>>> 
>>>> I'm using the cdh distribution by the way..
>>>> --
>>>> Regards-
>>>> Pavan
>> 
>> 
>> 
>> --
>> Regards-
>> Pavan
> 
> 
> 
> -- 
> Regards-
> Pavan

Re: Maven Cloudera Configuration problem

Posted by Brad Cox <br...@gmail.com>.
I've been stuck on the same question lately so don't take this as definitive, just my best guess at what's required.

Using maven as your hadoop source is going to give you a "vanilla" hadoop; one that runs on localhost. You need one that you've customized to point to your remote cluster and you can't get that via maven. 

So my *GUESS* is you need to do a plain local install of hadoop and point HADOOP_HOME at that. Customize as required, then convince eclipse to use that instead of going thru maven (i.e. remove hadoop from the dependency list).

Everyone; is this on the right path? Anyone know of exact instructions?

On Aug 13, 2013, at 12:07 PM, Pavan Sudheendra <pa...@gmail.com> wrote:

> When i actually run the job on the multi node cluster, logs shows it
> uses localhost configurations which i don't want..
> 
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
> 
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
> 
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com> wrote:
>> Hi Shabab and Sandy,
>> The thing is we have a 6 node cloudera cluster running.. For
>> development purposes, i was building a map-reduce application on a
>> single node apache distribution hadoop with maven..
>> 
>> To be frank, i don't know how to deploy this application on a multi
>> node cloudera cluster. I am fairly well versed with Multi Node Apache
>> Hadoop Distribution.. So, how can i go forward?
>> 
>> Thanks for all the help :)
>> 
>> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>>> Hi Pavan,
>>> 
>>> Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from.
>>> 
>>> Is there an issue you're coming up against when trying to run your job on a cluster?
>>> 
>>> -Sandy
>>> 
>>> (iphnoe tpying)
>>> 
>>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> I'm currently using maven to build the jars necessary for my
>>>> map-reduce program to run and it works for a single node cluster..
>>>> 
>>>> For a multi node cluster, how do i specify my map-reduce program to
>>>> ingest the cluster settings instead of localhost settings?
>>>> I don't know how to specify this using maven to build my jar.
>>>> 
>>>> I'm using the cdh distribution by the way..
>>>> --
>>>> Regards-
>>>> Pavan
>> 
>> 
>> 
>> --
>> Regards-
>> Pavan
> 
> 
> 
> -- 
> Regards-
> Pavan

Dr. Brad J. Cox    Cell: 703-594-1883 Blog: http://bradjcox.blogspot.com http://virtualschool.edu





Re: Maven Cloudera Configuration problem

Posted by Pavan Sudheendra <pa...@gmail.com>.
When I actually run the job on the multi-node cluster, the logs show it
uses localhost configurations, which I don't want..

I just have a pom.xml which lists all the dependencies like standard
hadoop, standard hbase, standard zookeeper etc. Should I remove these
dependencies?

I want the cluster settings to apply in my map-reduce application..
So this is where I'm stuck.

On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra <pa...@gmail.com> wrote:
> Hi Shabab and Sandy,
> The thing is we have a 6 node cloudera cluster running.. For
> development purposes, i was building a map-reduce application on a
> single node apache distribution hadoop with maven..
>
> To be frank, i don't know how to deploy this application on a multi
> node cloudera cluster. I am fairly well versed with Multi Node Apache
> Hadoop Distribution.. So, how can i go forward?
>
> Thanks for all the help :)
>
> On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
>> Hi Pavan,
>>
>> Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from.
>>
>> Is there an issue you're coming up against when trying to run your job on a cluster?
>>
>> -Sandy
>>
>> (iphnoe tpying)
>>
>> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com> wrote:
>>
>>> Hi,
>>> I'm currently using maven to build the jars necessary for my
>>> map-reduce program to run and it works for a single node cluster..
>>>
>>> For a multi node cluster, how do i specify my map-reduce program to
>>> ingest the cluster settings instead of localhost settings?
>>> I don't know how to specify this using maven to build my jar.
>>>
>>> I'm using the cdh distribution by the way..
>>> --
>>> Regards-
>>> Pavan
>
>
>
> --
> Regards-
> Pavan



-- 
Regards-
Pavan


Re: Maven Cloudera Configuration problem

Posted by Pavan Sudheendra <pa...@gmail.com>.
Hi Shahab and Sandy,
The thing is, we have a 6-node Cloudera cluster running.. For
development purposes, I was building a map-reduce application on a
single-node Apache distribution of Hadoop with maven..

To be frank, I don't know how to deploy this application on a multi-node
Cloudera cluster. I am fairly well versed with a multi-node Apache
Hadoop distribution.. So, how can I go forward?

Thanks for all the help :)

On Tue, Aug 13, 2013 at 9:22 PM,  <sa...@cloudera.com> wrote:
> Hi Pavan,
>
> Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from.
>
> Is there an issue you're coming up against when trying to run your job on a cluster?
>
> -Sandy
>
> (iphnoe tpying)
>
> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com> wrote:
>
>> Hi,
>> I'm currently using maven to build the jars necessary for my
>> map-reduce program to run and it works for a single node cluster..
>>
>> For a multi node cluster, how do i specify my map-reduce program to
>> ingest the cluster settings instead of localhost settings?
>> I don't know how to specify this using maven to build my jar.
>>
>> I'm using the cdh distribution by the way..
>> --
>> Regards-
>> Pavan



-- 
Regards-
Pavan


Re: Maven Cloudera Configuration problem

Posted by sa...@cloudera.com.
Hi Pavan,

Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from.

Is there an issue you're coming up against when trying to run your job on a cluster?

-Sandy

(iPhone typing)

On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra <pa...@gmail.com> wrote:

> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
> 
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
> 
> I'm using the cdh distribution by the way..
> -- 
> Regards-
> Pavan
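
To illustrate Sandy's point: the cluster addresses come from the submitting host's configuration directory, not from the jar Maven builds. A minimal sketch of submitting the same jar against cluster settings — the paths, jar name, and driver class below are placeholders, not values from this thread:

```shell
# The 'hadoop' client reads the *-site.xml files from HADOOP_CONF_DIR at
# submit time. Point it at a directory holding the cluster's configuration
# instead of the localhost one used for single-node development.
export HADOOP_CONF_DIR=/etc/hadoop/conf   # typical CDH client config path

# The jar itself is unchanged between clusters; only the config dir differs.
SUBMIT_CMD="hadoop jar target/my-mr-job.jar com.example.MyJob /in /out"
echo "would run: $SUBMIT_CMD"   # dry run; invoke the command itself to submit
```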

Re: Maven Cloudera Configuration problem

Posted by Raj K Singh <ra...@gmail.com>.
There is no way to specify conf settings in the Maven pom.xml; instead,
you can build your project with a profile and keep the properties in a
separate property file.

For setting the conf properties, it is better to create a shell script to
run your jar, in which you provide the conf parameters.

::::::::::::::::::::::::::::::::::::::::
Raj K Singh
http://www.rajkrrsingh.blogspot.com
Mobile  Tel: +91 (0)9899821370


On Tue, Aug 13, 2013 at 4:49 PM, Pavan Sudheendra <pa...@gmail.com>wrote:

> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>
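
A sketch of such a wrapper script — the jar name, driver class, and host names are assumptions, and the -D options are applied as job configuration only if the driver parses them via ToolRunner/GenericOptionsParser:

```shell
#!/usr/bin/env bash
# run-job.sh - submit the Maven-built jar with cluster settings supplied
# at launch time rather than baked into the jar.
JAR=target/my-mr-job.jar          # produced by 'mvn package' (assumed name)
MAIN=com.example.MyJob            # hypothetical driver class

CMD="hadoop jar $JAR $MAIN \
  -Dfs.defaultFS=hdfs://namenode-host:8020 \
  -Dmapred.job.tracker=jobtracker-host:8021 \
  /input /output"

echo "$CMD"   # dry run; replace the echo with the command itself to submit
```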


Re: Maven Cloudera Configuration problem

Posted by Shahab Yunus <sh...@gmail.com>.
You need to configure your namenode and jobtracker information in the
configuration files within your application. Only set the relevant
properties in the copy of the files that you are bundling with your job;
for the rest, the default values are used from the default configuration
files (core-default.xml, mapred-default.xml) already bundled in the
lib/jars provided by Cloudera/Hadoop. The assumption is that this is for
MRv1.

Anyway, you should go through this for details:
http://hadoop.apache.org/docs/stable/cluster_setup.html

*core-site.xml* (the security properties are optional; if you are not
using anything special you can remove them and rely on the defaults,
which are also 'simple'):

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://server:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>simple</value>
  </property>
  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>DEFAULT</value>
  </property>
</configuration>

*mapred-site.xml*

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>server:8021</value>
  </property>
</configuration>


Regards,
Shahab



On Tue, Aug 13, 2013 at 7:19 AM, Pavan Sudheendra <pa...@gmail.com>wrote:

> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>
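
The two client-side files described above can be generated and sanity-checked locally; this sketch uses placeholder hostnames and the usual CDH MRv1 default ports (8020 for the NameNode, 8021 for the JobTracker), which may differ on your cluster:

```shell
# Write a minimal client-side configuration into a local directory, then
# point the hadoop client at it when submitting a job.
mkdir -p cluster-conf

cat > cluster-conf/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:8020</value>
  </property>
</configuration>
EOF

cat > cluster-conf/mapred-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker-host:8021</value>
  </property>
</configuration>
EOF

# Submit against the cluster rather than localhost:
#   HADOOP_CONF_DIR=$PWD/cluster-conf hadoop jar my-job.jar ...
```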
