Posted to common-issues@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2016/02/22 02:45:18 UTC

[jira] [Comment Edited] (HADOOP-12830) Bash environment for quick command operations

    [ https://issues.apache.org/jira/browse/HADOOP-12830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156363#comment-15156363 ] 

Allen Wittenauer edited comment on HADOOP-12830 at 2/22/16 1:45 AM:
--------------------------------------------------------------------

* Take a look at https://wiki.apache.org/hadoop/UnixShellScriptProgrammingGuide for some hints on how the .sh file should be written (e.g., HSH_ should be HADOOP_; use the existing shell functions instead of duplicating code for stop/start/etc.; don't declare vars in the middle of the code; don't leave vars undeclared; ...).

* As you acknowledge, this won't work on anything but Linux.  So this should fail gracefully rather than spew errors all over the screen.
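
One way to fail gracefully would be a single up-front platform check. This is a minimal sketch, assuming that `uname -s` is an acceptable way to detect the platform; the function names are hypothetical and do not come from the patch:

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the patch): returns 0 when the given
# kernel name, or the running kernel if none is given, is Linux.
hsh_supported_os() {
  local os="${1:-$(uname -s)}"
  [[ "${os}" == "Linux" ]]
}

# Hypothetical guard: print one clear message and return non-zero so the
# caller can bail out before any Linux-only command runs.
hsh_require_linux() {
  if ! hsh_supported_os "$@"; then
    echo "ERROR: the hadoop shell is only supported on Linux." 1>&2
    return 1
  fi
}
```

The caller would run something like `hsh_require_linux || exit 1` before doing anything else.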

* This looks like it has a pretty massive security hole.  Anyone writing to the fifo (e.g., anyone with root) will be able to execute commands as the person who opened it.  To me, this is pretty much an instant -1.

* Use "${BASH_SOURCE-$0}" coupled with a bash regex here to cut the extra fork and to work when executed directly with bash -x:

{code}
+# if this file is executed, start the shell
+if [[ $(basename $0) == "hadoop-shell.sh" ]]; then
{code}
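
A rough sketch of what that could look like. The helper name and the exact regex are illustrative assumptions, not something taken from the patch:

```shell
#!/usr/bin/env bash

# Hypothetical helper: returns 0 when the given path names hadoop-shell.sh.
# The match is done by bash itself, so no $(basename ...) subshell is forked.
hsh_is_entry_script() {
  local path="$1"
  # Keep the regex in a variable so the parentheses and backslash are not
  # reinterpreted by the shell inside [[ ... ]].
  local re='(^|/)hadoop-shell\.sh$'
  [[ "${path}" =~ ${re} ]]
}

# Intended use (commented out so that sourcing this sketch has no side effects):
#   if hsh_is_entry_script "${BASH_SOURCE-$0}"; then
#     ...start the shell...
#   fi
```

Note that "${BASH_SOURCE-$0}" also names this file when it is sourced, so if the check must tell "executed" apart from "sourced", a common idiom is to additionally compare it against "$0".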

* Instead of using "which", use "command" here:
{code}
+  if [[ -z $(which hadoop) ]]; then
{code}
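
For example, something along these lines. The helper name is made up for illustration; `command -v` is a shell builtin, so nothing external is executed and no output has to be captured:

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when the named command can be resolved
# through PATH (or as a builtin, function, or alias).
hsh_have_command() {
  command -v "$1" >/dev/null 2>&1
}

# Intended use in place of [[ -z $(which hadoop) ]]:
#   if ! hsh_have_command hadoop; then
#     ...handle the missing hadoop command...
#   fi
```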

* I don't think there is any guarantee that HADOOP_PREFIX has been defined at this point, or that it even points to the correct hadoop command. (There are a lot of reasons why, too many to go into here.)

{code}
 +    export PATH=${HADOOP_PREFIX}/bin:${PATH}
{code}
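
At a minimum, the export could be guarded. The sketch below uses a hypothetical function name and only verifies that HADOOP_PREFIX is set and contains an executable bin/hadoop; it does not settle whether that is the *correct* hadoop command:

```shell
#!/usr/bin/env bash

# Hypothetical helper: prepend ${HADOOP_PREFIX}/bin to PATH only when
# HADOOP_PREFIX is set and actually contains an executable bin/hadoop.
hsh_add_hadoop_to_path() {
  if [[ -z "${HADOOP_PREFIX:-}" ]]; then
    echo "ERROR: HADOOP_PREFIX is not set." 1>&2
    return 1
  fi
  if [[ ! -x "${HADOOP_PREFIX}/bin/hadoop" ]]; then
    echo "ERROR: ${HADOOP_PREFIX}/bin/hadoop is not executable." 1>&2
    return 1
  fi
  export PATH="${HADOOP_PREFIX}/bin:${PATH}"
}
```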



> Bash environment for quick command operations
> ---------------------------------------------
>
>                 Key: HADOOP-12830
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12830
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: bin
>            Reporter: Kazuho Fujii
>            Assignee: Kazuho Fujii
>         Attachments: HADOOP-12830.001.patch
>
>
> Hadoop file system shell commands are slow. This issue is about building a shell environment for quick command operations.
> An interactive shell was previously attempted in HADOOP-6541, but it fell short because users are used to powerful shells like bash. This issue is not about creating a new shell, just about opening a new bash process, so users can run commands as before.
> {code}
> fjk@x240:~/hadoop-2.7.2$ ./bin/hadoop shell
> fjk@x240 hadoop> hadoop fs -ls /
> Found 2 items
> -rw-r--r--   3 fjk supergroup          0 2016-02-21 00:26 /file1
> -rw-r--r--   3 fjk supergroup          0 2016-02-21 00:26 /file2
> {code}
> The shell has a mini daemon process that lives until the shell is closed. The hadoop fs command delegates the operation to the daemon, and the two communicate through named pipes. The daemon performs the operation and returns the result to the command.
> In this shell, hadoop fs commands become fast. In a local environment, the "hadoop fs -ls" command is about 100 times faster than the normal command.
> {code}
> fjk@x240 hadoop> time hadoop fs -ls hdfs://localhost:8020/ > /dev/null
> real	0m0.021s
> user	0m0.003s
> sys	0m0.011s
> {code}
> Using bash's completion feature, commands and file names are completed automatically.
> {code}
> fjk@x240 hadoop> hadoop fs -ch<TAB><TAB>
> -checksum  -chgrp     -chmod     -chown
> fjk@x240 hadoop> hadoop fs -ls /file<TAB><TAB>
> /file1  /file2  /file3
> {code}
> Additionally, we can provide equivalents of bash built-in commands, e.g., cd and umask. They can work in this shell because the daemon remembers the state.


