You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@bigtop.apache.org by "Konstantin Boudnik (JIRA)" <ji...@apache.org> on 2014/02/02 20:44:08 UTC

[jira] [Commented] (BIGTOP-1200) Implement init-hcfs.sh to define filesystem semantics , and refactor init-hdfs.sh into a facade.

    [ https://issues.apache.org/jira/browse/BIGTOP-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889035#comment-13889035 ] 

Konstantin Boudnik commented on BIGTOP-1200:
--------------------------------------------

I would suggest to wait for a bit on the refactoring of init-hdfs.sh. Here's the reasoning:
  - {{init-hdfs.sh}} is very slow cause it requires a number of JVM startups via {{hadoop}} script
  - it is hard to use for a filesystem that has been partially initialed already (e.g. {{/tmp}} has been already created; in this case the script will fail)

If instead we can simply push forward with BIGTOP-952 which is pretty much unblocked as BIGTOP-1097 ready for commit. Having direct calls to DFS APIs is beneficial on many levels and provide a much more powerful programming environment for all sorts of improvements.

If the new functionality is so important, I think it can be achieved in a simpler way by extending {{init-hdfs.sh}} to either have command line argument or to read an env. var. for the super-user name. Current changes does seem to be too massive for such a small improvement, doesn't it?

> Implement init-hcfs.sh to define filesystem semantics , and refactor init-hdfs.sh into a facade.
> ------------------------------------------------------------------------------------------------
>
>                 Key: BIGTOP-1200
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1200
>             Project: Bigtop
>          Issue Type: Improvement
>            Reporter: jay vyas
>         Attachments: BIGTOP-1200.patch
>
>
> One of the really useful artifacts in bigtop is the init-hdfs.sh script.  It defines ecosystem semantics and expectations for hadoop clusters. 
> Other HCFS filesystems can leverage the logic in this script quite easily, if we decouple its implementation from being HDFS specific by specifying a "SUPERUSER" parameter to replace "hdfs".
> And yes we can still have the init-hdfs.sh convenience script : which just calls "init-hcfs.sh hdfs" .
> Initial tests in puppet VMs pass.  (attaching patch with this JIRA)
> {noformat}
> [root@vagrant bigtop-puppet]# ./init-hdfs.sh 
> + echo 'Now initializing the Distributed File System with root=HDFS'
> Now initializing the Distributed File System with root=HDFS
> + ./init-hcfs.sh hdfs
> + '[' 1 -ne 1 ']'
> + SUPER_USER=hdfs
> + echo 'Initializing the DFS with super user : hdfs'
> Initializing the DFS with super user : hdfs
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /tmp'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 1777 /tmp'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /var'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /var/log'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 1775 /var/log'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown yarn:mapred /var/log'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /tmp/hadoop-yarn'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown -R mapred:mapred /tmp/hadoop-yarn'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /tmp/hadoop-yarn'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /var/log/hadoop-yarn/apps'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 1777 /var/log/hadoop-yarn/apps'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown yarn:mapred /var/log/hadoop-yarn/apps'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /hbase'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown hbase:hbase /hbase'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /solr'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown solr:solr /solr'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /benchmarks'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /benchmarks'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod 755 /user'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown hdfs  /user'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/history'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown mapred:mapred /user/history'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod 755 /user/history'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/jenkins'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /user/jenkins'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown jenkins /user/jenkins'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/hive'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /user/hive'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown hive /user/hive'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/root'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /user/root'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown root /user/root'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/hue'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /user/hue'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown hue /user/hue'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/sqoop'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /user/sqoop'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown sqoop /user/sqoop'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/oozie'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /user/oozie'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown -R oozie /user/oozie'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/oozie/share'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/oozie/share/lib'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/oozie/share/lib/hive'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/oozie/share/lib/mapreduce-streaming'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/oozie/share/lib/distcp'
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/oozie/share/lib/pig'
> + ls '/usr/lib/hive/lib/*.jar'
> + ls /usr/lib/hadoop-mapreduce/hadoop-streaming-2.0.6-alpha.jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar
> + su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -put /usr/lib/hadoop-mapreduce/hadoop-streaming*.jar /user/oozie/share/lib/mapreduce-streaming'
> put: `/user/oozie/share/lib/mapreduce-streaming/hadoop-streaming-2.0.6-alpha.jar': File exists
> put: `/user/oozie/share/lib/mapreduce-streaming/hadoop-streaming.jar': File exists
> [root@vagrant bigtop-puppet]# 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)