Posted to user@bigtop.apache.org by Steven Núñez <st...@illation.com> on 2013/12/06 10:02:43 UTC

Hue Using MR1?

Gents,

I’m starting an install from BigTop 0.70: HDFS, YARN and Hue, with the goal of building up from there to a minimal stack. In theory, this should be as simple as ‘yum install hadoop\* hue\*’; in practice this turns out to be surprisingly broken. For example, after the yum install, Hue is reporting a configuration error:

hadoop.mapred_clusters.default.hadoop_mapred_home Current value: /usr/lib/hadoop-0.20-mapreduce
Path does not exist on the filesystem.

Nowhere is this being set from the files that BigTop installed. Why is Hue looking for MR1 stuff?
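For what it’s worth, my best guess is that the MR1 default comes from Hue’s own bundled hue.ini rather than anything BigTop writes, and that the fix is to override it there. Something like the following (the stanza nesting follows the Hue 3.5 manual; the path is just what the YARN-era packages appear to install on my box, so treat it as a guess):

```ini
# /etc/hue/hue.ini -- hypothetical override; section nesting per the
# Hue 3.5 docs, the path is an assumption based on the local install.
[hadoop]
  [[mapred_clusters]]
    [[[default]]]
      # Point at the MR2/YARN MapReduce install instead of MR1.
      hadoop_mapred_home=/usr/lib/hadoop-mapreduce
```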

Fixing another misconfiguration reported by Hue:

hadoop.hdfs_clusters.default.webhdfs_url Current value: None
Failed to access filesystem root

This is simple. However, when restarting the daemons with "for i in hadoop-hdfs-namenode hadoop-hdfs-datanode ; do service $i restart ; done", both daemons fail to restart (say, what happened to start-dfs.sh?). Extracts from the logs show the reasons:


  *   Datanode because: java.net.BindException: Problem binding to [0.0.0.0:50010] java.net.BindException: Address already in use
  *   Namenode because: java.io.IOException: Cannot lock storage /var/lib/hadoop-hdfs/cache/hdfs/dfs/name. The directory is already locked

So what I’ve done is a fresh install, modified two config files as described in the hue configuration section<http://cloudera.github.io/hue/docs-3.5.0/manual.html#usage>, and tried to restart the HDFS daemons, and they’re failing. This probably isn’t what I’d call ‘working’.
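For reference, the webhdfs change itself is only a couple of lines; this is roughly what I put in (localhost:50070 is my single-node guess, use the namenode FQDN on a real cluster, and dfs.webhdfs.enabled must also be set to true in hdfs-site.xml per the Hue manual linked above):

```ini
# /etc/hue/hue.ini -- point Hue at WebHDFS on the namenode.
# localhost:50070 assumes a pseudo-distributed, single-node setup.
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      webhdfs_url=http://localhost:50070/webhdfs/v1/
```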

I know that BigTop is at 0.70, but then again version numbers don’t mean much these days. Is this kind of error to be expected after a fresh install? When will it be safe to assume that a simple stack (HDFS, YARN, Hue, Oozie, Zookeeper) installation would be working ‘out of the box’ without loads of manual configuration?

Perhaps I’m missing the point (I’m definitely missing the documentation), but at this stage it seems BigTop’s primary advantage is to ensure that the collection of packages is version compatible. Would that be fair to say? I’m not unappreciative of the value there, but just want to set expectations with people on what they’re getting with BigTop in its current state. I recall some discussion of configuration before, and there were somewhat different opinions. One that I recall was (my words, summarizing my understanding):

BigTop doesn’t do any configuration, that’s all left to the individual packages; BigTop just places them on the filesystem in (somewhat, there’s still .cfg, .ini, .conf files) a consistent manner.

Would that be a fair statement?

Cheers,
- Steve



Re: Puppet manual

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
Hi Ivo!

On Tue, Jan 7, 2014 at 7:40 AM, Ivo Frankov <i....@googlemail.com> wrote:
> Dear All,
>
> Is there a step-by-step manual for Puppet configuration on CentOS 6?
>
> Following questions:
> a) How do I install Puppet? Is it already in the Bigtop distribution?

Puppet itself is best obtained from Puppet Labs' own repo. E.g. for
CentOS 6.2/x64 you'll need this:
http://yum.puppetlabs.com/el/6.2/products/x86_64/

Personally, I'd recommend using Puppet 2.7
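The usual route is to install the puppetlabs-release RPM from that URL and then `yum install puppet`; if you would rather drop a repo file in by hand, it would look roughly like this (a sketch only: baseurl taken from the link above, GPG details omitted, adjust the EL version and arch for your box):

```ini
# /etc/yum.repos.d/puppetlabs.repo -- hand-written equivalent of the
# puppetlabs-release package; enable gpgcheck with the Puppet Labs key
# for anything beyond a throwaway test box.
[puppetlabs-products]
name=Puppet Labs Products EL 6
baseurl=http://yum.puppetlabs.com/el/6.2/products/x86_64/
enabled=1
gpgcheck=0
```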

> b) Some scripts to start a pseudo-distributed cluster for Hadoop, Spark,
> Giraph, Solr, HBase, Pig?

I haven't tried our puppet recipes much in the pseudo-distributed
configuration, but on a fully distributed cluster they seem to work
reasonably well. You can try them in the pseudo-distributed case and let
us know. The entry point remains the same:
     https://github.com/apache/bigtop/tree/master/bigtop-deploy/puppet
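For the fully distributed case, the configuration there mostly boils down to a small site.csv of key,value lines under bigtop-deploy/puppet/config. From memory it looks something like the below; the hostnames, paths and repo URI are placeholders, and the README in that directory has the authoritative list of keys:

```csv
hadoop_head_node,head01.example.com
hadoop_storage_dirs,/data/1,/data/2
bigtop_yumrepo_uri,http://bigtop-repo.example.com/releases/0.7.0/redhat/6/x86_64
jdk_package_name,java-1.7.0-openjdk-devel.x86_64
```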

Thanks,
Roman.

Puppet manual

Posted by Ivo Frankov <i....@googlemail.com>.
Dear All,

Is there a step-by-step manual for Puppet configuration on CentOS 6?

Following questions:
a) How do I install Puppet? Is it already in the Bigtop distribution?
b) Some scripts to start a pseudo-distributed cluster for Hadoop, Spark,
Giraph, Solr, HBase, Pig?

Best regards,
Ivo
On 12/06/2013 05:43 PM, Roman Shaposhnik wrote:
> On Fri, Dec 6, 2013 at 4:27 AM, Steven Núñez <st...@illation.com> wrote:
>> Good question. I’ve got to say the BigTop community earns top points for
>> being helpful and constructive.
> Thanks!
>
>> Specifically, I’d say a working pseudo configuration for all components
>> would be a great start for newcomers. Second, I’d look at puppet
>> integration, perhaps via Ambari or Hue. The update could prompt the user:
>> “Would you like to configure your cluster now? <y/n>” and then start the
>> interface for that. If that’s too ambitious for a weekend, then include a
>> number of working puppet recipes and install them, along with some
>> pointers to the alternatives system.
> A really great way to make sure ideas like this one don't get dropped is
> to file a JIRA. An even greater way is to attach a patch for said JIRA ;-)
> It doesn't have to be all perfect to begin with -- as Linus would say:
> attach early, attach often.
>
> Thanks,
> Roman.


Re: Hue Using MR1?

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Fri, Dec 6, 2013 at 4:27 AM, Steven Núñez <st...@illation.com> wrote:
> Good question. I’ve got to say the BigTop community earns top points for
> being helpful and constructive.

Thanks!

> Specifically, I’d say a working pseudo configuration for all components
> would be a great start for newcomers. Second, I’d look at puppet
> integration, perhaps via Ambari or Hue. The update could prompt the user:
> “Would you like to configure your cluster now? <y/n>” and then start the
> interface for that. If that’s too ambitious for a weekend, then include a
> number of working puppet recipes and install them, along with some
> pointers to the alternatives system.

A really great way to make sure ideas like this one don't get dropped is
to file a JIRA. An even greater way is to attach a patch for said JIRA ;-)
It doesn't have to be all perfect to begin with -- as Linus would say:
attach early, attach often.

Thanks,
Roman.

Re: Hue Using MR1?

Posted by Steven Núñez <st...@illation.com>.
Good question. I’ve got to say the BigTop community earns top points for
being helpful and constructive.


Specifically, I’d say a working pseudo configuration for all components
would be a great start for newcomers. Second, I’d look at puppet
integration, perhaps via Ambari or Hue. The update could prompt the user:
“Would you like to configure your cluster now? <y/n>” and then start the
interface for that. If that’s too ambitious for a weekend, then include a
number of working puppet recipes and install them, along with some
pointers to the alternatives system.

Regards,
	- SteveN

On 2013-12-6 18:42 , "Bruno Mahé" <bm...@apache.org> wrote:

>Most devs on this list probably either use the puppet recipes or have a
>set of config files ready to be reused for their different installations.
>So we do not always encounter the same issues as you do from a fresh
>start.
>A hackathon is coming up this weekend and I would like to tackle some
>tasks to make Apache Bigtop more user friendly. So besides some more
>consistent documentation and pseudo conf packages, would there be
>anything else you would like to see happening?


Re: Hue Using MR1?

Posted by Steven Núñez <st...@illation.com>.
I guess the last thing I would add is a bit of QA around the init scripts
and other installed items. They seem to be frayed around the edges.



On 2013-12-6 18:42 , "Bruno Mahé" <bm...@apache.org> wrote:

>See inline.
>
>
>On 12/06/2013 01:02 AM, Steven Núñez wrote:
>> Gents,
>>
>> I’m starting an install from BigTop 0.70: HDFS, YARN and Hue, with the
>> goal of building up from there to a minimal stack. In theory, this
>> should be as simple as ‘yum install hadoop\* hue\*’; in practice this
>> turns out to be surprisingly broken. For example, after the yum install,
>> Hue is reporting a configuration error:
>>
>>     hadoop.mapred_clusters.default.hadoop_mapred_home Current value:
>>     /usr/lib/hadoop-0.20-mapreduce
>>     Path does not exist on the filesystem.
>>
>>
>> Nowhere is this being set from the files that BigTop installed. Why is
>> Hue looking for MR1 stuff?
>>
>> Fixing another misconfiguration reported by Hue:
>>
>>     hadoop.hdfs_clusters.default.webhdfs_url Current value: None
>>     Failed to access filesystem root
>>
>>
>
>Packages do not configure services for you. Not because it is broken,
>but because it is not their responsibility.
>In order to configure services correctly, packages would have to know
>about the cluster topology and probably a few more things.
>The closest thing you could find to that are the "pseudo conf"
>configurations which assume everything is run locally.
>In your example, you are just looking at Hue's default configuration.
>If you want to deploy Apache Bigtop on multiple nodes, I would recommend
>you take a look at the puppet recipes, which do configure services.
>Even if you do not use puppet, you can still look at the configuration
>templates to help you derive one for your cluster.
>
>
>Would it be helpful if more packages were to provide "pseudo
>conf" configurations where everything is assumed to run locally?
>
>
>
>> This is simple. However, when restarting the daemons with: "for i in
>> hadoop-hdfs-namenode hadoop-hdfs-datanode ; do service $i restart ;
>> done", both daemons fail to restart (say, what happened to
>> start-dfs.sh?). Extracts from the logs show the reasons:
>>
>>   * Datanode because: java.net.BindException: Problem binding to
>>     [0.0.0.0:50010] java.net.BindException: Address already in use
>>   * Namenode because: java.io.IOException: Cannot lock storage
>>     /var/lib/hadoop-hdfs/cache/hdfs/dfs/name. The directory is already
>>     locked
>>
>> So what I’ve done is a fresh install, modified two config files as
>> described in the hue configuration section
>> <http://cloudera.github.io/hue/docs-3.5.0/manual.html#usage>, and tried
>> to restart the HDFS daemons, and they’re failing. This probably isn’t
>> what I’d call ‘working’.
>>
>
>start-dfs.sh is not meant to be used with packages. It would not work.
>
>Can you describe *exactly* what you did from the beginning along with
>the output?
>The easier it is to reproduce, the easier it will be to fix/help you :)
>
>
>
>> I know that BigTop is at 0.70, but then again version numbers don’t mean
>> much these days. Is this kind of error to be expected after a fresh
>> install? When will it be safe to assume that a simple stack (HDFS, YARN,
>> Hue, Oozie, Zookeeper) installation would be working ‘out of the box’
>> without loads of manual configuration?
>>
>> Perhaps I’m missing the point (I’m definitely missing the
>> documentation), but at this stage it seems BigTop’s primary advantage is
>> to ensure that the collection of packages is version compatible. Would
>> that be fair to say? I’m not unappreciative of the value there, but just
>> want to set expectations with people on what they’re getting with BigTop
>> in its current state. I recall some discussion of configuration before,
>> and there were somewhat different opinions. One that I recall was (my
>> words, summarizing my understanding):
>>
>>     BigTop doesn’t do any configuration, that’s all left to the
>>     individual packages; BigTop just places them on the filesystem in
>>     (somewhat, there’s still .cfg, .ini, .conf files) a consistent
>> manner.
>>
>>
>> Would that be a fair statement?
>>
>> Cheers,
>> - Steve
>>
>>
>
>The packages do not set the configuration. But the Apache Bigtop puppet
>recipes will set the configuration.
>
>Packages will, as you describe, put files on the filesystem in a
>consistent manner, but they also create users and perform other tasks to
>provide a rich integration with the system (init scripts, ulimit, etc.).
>
>Most devs on this list probably either use the puppet recipes or have a
>set of config files ready to be reused for their different installations.
>So we do not always encounter the same issues as you do from a fresh
>start.
>A hackathon is coming up this weekend and I would like to tackle some
>tasks to make Apache Bigtop more user friendly. So besides some more
>consistent documentation and pseudo conf packages, would there be
>anything else you would like to see happening?
>
>
>Thanks,
>Bruno


Re: Hue Using MR1?

Posted by Bruno Mahé <bm...@apache.org>.
See inline.


On 12/06/2013 01:02 AM, Steven Núñez wrote:
> Gents,
>
> I’m starting an install from BigTop 0.70: HDFS, YARN and Hue, with the
> goal of building up from there to a minimal stack. In theory, this
> should be as simple as ‘yum install hadoop\* hue\*’; in practice this
> turns out to be surprisingly broken. For example, after the yum install,
> Hue is reporting a configuration error:
>
>     hadoop.mapred_clusters.default.hadoop_mapred_home Current value:
>     /usr/lib/hadoop-0.20-mapreduce
>     Path does not exist on the filesystem.
>
>
> Nowhere is this being set from the files that BigTop installed. Why is
> Hue looking for MR1 stuff?
>
> Fixing another misconfiguration reported by Hue:
>
>     hadoop.hdfs_clusters.default.webhdfs_url Current value: None
>     Failed to access filesystem root
>
>

Packages do not configure services for you. Not because it is broken, 
but because it is not their responsibility.
In order to configure services correctly, packages would have to know 
about the cluster topology and probably a few more things.
The closest thing you could find to that are the "pseudo conf" 
configurations which assume everything is run locally.
In your example, you are just looking at Hue's default configuration.
If you want to deploy Apache Bigtop on multiple nodes, I would recommend
you take a look at the puppet recipes, which do configure services.
Even if you do not use puppet, you can still look at the configuration 
templates to help you derive one for your cluster.


Would it be helpful if more packages were to provide "pseudo
conf" configurations where everything is assumed to run locally?



> This is simple. However, when restarting the daemons with: "for i in
> hadoop-hdfs-namenode hadoop-hdfs-datanode ; do service $i restart ;
> done", both daemons fail to restart (say, what happened to
> start-dfs.sh?). Extracts from the logs show the reasons:
>
>   * Datanode because: java.net.BindException: Problem binding to
>     [0.0.0.0:50010] java.net.BindException: Address already in use
>   * Namenode because: java.io.IOException: Cannot lock storage
>     /var/lib/hadoop-hdfs/cache/hdfs/dfs/name. The directory is already
>     locked
>
> So what I’ve done is a fresh install, modified two config files as
> described in the hue configuration section
> <http://cloudera.github.io/hue/docs-3.5.0/manual.html#usage>, and tried
> to restart the HDFS daemons, and they’re failing. This probably isn’t
> what I’d call ‘working’.
>

start-dfs.sh is not meant to be used with packages. It would not work.

Can you describe *exactly* what you did from the beginning along with 
the output?
The easier it is to reproduce, the easier it will be to fix/help you :)



> I know that BigTop is at 0.70, but then again version numbers don’t mean
> much these days. Is this kind of error to be expected after a fresh
> install? When will it be safe to assume that a simple stack (HDFS, YARN,
> Hue, Oozie, Zookeeper) installation would be working ‘out of the box’
> without loads of manual configuration?
>
> Perhaps I’m missing the point (I’m definitely missing the
> documentation), but at this stage it seems BigTop’s primary advantage is
> to ensure that the collection of packages is version compatible. Would
> that be fair to say? I’m not unappreciative of the value there, but just
> want to set expectations with people on what they’re getting with BigTop
> in its current state. I recall some discussion of configuration before,
> and there were somewhat different opinions. One that I recall was (my
> words, summarizing my understanding):
>
>     BigTop doesn’t do any configuration, that’s all left to the
>     individual packages; BigTop just places them on the filesystem in
>     (somewhat, there’s still .cfg, .ini, .conf files) a consistent manner.
>
>
> Would that be a fair statement?
>
> Cheers,
> - Steve
>
>

The packages do not set the configuration. But the Apache Bigtop puppet 
recipes will set the configuration.

Packages will, as you describe, put files on the filesystem in a
consistent manner, but they also create users and perform other tasks to
provide a rich integration with the system (init scripts, ulimit, etc.).

Most devs on this list probably either use the puppet recipes or have a 
set of config files ready to be reused for their different installations.
So we do not always encounter the same issues as you do from a fresh start.
A hackathon is coming up this weekend and I would like to tackle some
tasks to make Apache Bigtop more user friendly. So besides some more 
consistent documentation and pseudo conf packages, would there be 
anything else you would like to see happening?


Thanks,
Bruno