Posted to common-issues@hadoop.apache.org by "Eric Yang (JIRA)" <ji...@apache.org> on 2011/08/05 19:31:27 UTC

[jira] [Created] (HADOOP-7521) bintar created tarball should use a common directory for prefix

bintar created tarball should use a common directory for prefix
---------------------------------------------------------------

                 Key: HADOOP-7521
                 URL: https://issues.apache.org/jira/browse/HADOOP-7521
             Project: Hadoop Common
          Issue Type: Bug
          Components: build
    Affects Versions: 0.23.0
         Environment: Java 6, Maven, Linux/Mac
            Reporter: Eric Yang


The binary tarball currently contains a directory structure like:

{noformat}
hadoop-common-0.23.0-SNAPSHOT-bin/bin
                                 /etc/hadoop
                                 /libexec
                                 /sbin
                                 /share/hadoop/common
{noformat}

It would be nice to rename the prefix directory to a directory name that is common to all Hadoop stack software.  That way, a user can untar hbase, hadoop, zookeeper, pig, and hive all into the same location and run them from the top-level directory without manually renaming each one to the same directory.

By default, the prefix directory can be /usr, so that it could merge with the base OS.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080135#comment-13080135 ] 

Alejandro Abdelnur commented on HADOOP-7521:
--------------------------------------------

Using usr/ leads to the following layout:

{code}
usr/
   + bin/
   + sbin/
   + etc/
   + libexec/
   + share/
   + lib/
{code}

IMO the layout should be:

{code}
<FOO>/
     + usr/bin/
     + usr/sbin/
     + usr/etc/
     + usr/libexec/
     + usr/share/
     + usr/lib/
     + usr/lib64/
{code}

Where <FOO> could be the same for common, hdfs, mapred.

When doing TARs, you have everything under <FOO>.

When doing RPM/DEB, the packaging logic removes the <FOO> level.

In addition, this would play nicely with runtime dirs like /var/log, /var/lib, and /var/run, which could work out of <FOO> when using TARs and out of / when doing RPM/DEB.
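
The TAR versus RPM/DEB distinction above can be sketched as a small path-mapping helper. This is only an illustration under stated assumptions: `install_path`, the `/opt/stack` root, and the example file names are hypothetical and are not part of the Hadoop build.

```python
from pathlib import PurePosixPath


def install_path(logical: str, root: str = "/") -> str:
    """Map a layout-relative path (e.g. 'usr/bin/hadoop') under an install root.

    For a TAR install, root plays the role of <FOO>; for RPM/DEB, root is '/'.
    """
    rel = PurePosixPath(logical)
    if rel.is_absolute():
        # The layout is described with relative paths only.
        raise ValueError("logical path must be relative to the install root")
    return str(PurePosixPath(root) / rel)


# TAR usage: everything, including runtime dirs, lives under the chosen root.
tar_bin = install_path("usr/bin/hadoop", "/opt/stack")
tar_log = install_path("var/log/hadoop", "/opt/stack")

# RPM/DEB usage: the same relative layout, with the <FOO> level removed.
pkg_bin = install_path("usr/bin/hadoop", "/")
pkg_log = install_path("var/log/hadoop", "/")
```

The same relative layout serves both packaging styles; only the root changes.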

Thoughts?

PS: I'm OK with using <FOO>/etc/ instead of <FOO>/usr/etc/ as well; things are still under the <FOO> prefix for TAR usage.



[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080156#comment-13080156 ] 

Eric Yang commented on HADOOP-7521:
-----------------------------------

Allen, the isolated tarball (pkgname-version) is still supported by the tar profile.  We are discussing the merged layout here.

Alejandro, your proposal introduces one more directory level ("usr") than necessary.  The original recommendation supports the layout that you propose, where usr = <FOO>/usr.

Hence, it is better to have package.prefix = "usr".  The RPM/DEB work already moves usr/etc/* to /etc/* and usr/var/* to /var/* in the build process.  If we decide that "usr" is not a good choice, it can be changed by overriding the Maven property.


[jira] [Updated] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated HADOOP-7521:
------------------------------

    Attachment: HADOOP-7521.patch

Set default prefix directory for binary tarball to "usr"


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080239#comment-13080239 ] 

Milind Bhandarkar commented on HADOOP-7521:
-------------------------------------------

Eric, that was just an example. How does one make sure that in all of the hadoop stack's components, no two files are named the same if they are part of say bin, sbin, etc ?


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080244#comment-13080244 ] 

Milind Bhandarkar commented on HADOOP-7521:
-------------------------------------------

FWIW, here is a definition of "tarbomb" from wikipedia:

A tarbomb is derogatory hacker slang used to refer to a tarball that does not follow the usual conventions, i.e. it contains many files that extract into the working directory. Such a tarball can create problems by overwriting files of the same name in the working directory, or mixing one project's files into another. It is almost always an inconvenience to the user, who is obliged to identify and delete a number of files scattered throughout the directory's contents. Such behavior is considered bad etiquette on the part of the archive's creator.


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080247#comment-13080247 ] 

Eric Yang commented on HADOOP-7521:
-----------------------------------

bq. Eric, that was just an example. How does one make sure that in all of the hadoop stack's components, no two files are named the same if they are part of say bin, sbin, etc ?

It is the responsibility of the community to ensure that there are no file name conflicts.  As a general rule of thumb, developers should make sure their file names do not conflict with those on the existing system.
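
A check for such conflicts could be automated rather than left to convention. The following is a minimal sketch, assuming each component's file list is already available (for example, from its tarball listing); `find_conflicts` and the file names in the usage example are hypothetical.

```python
from collections import defaultdict


def find_conflicts(manifests):
    """Return {path: [components...]} for every path shipped by more than one component.

    manifests maps a component name to the relative paths it installs
    under the shared prefix.
    """
    owners = defaultdict(set)
    for component, paths in manifests.items():
        for path in paths:
            owners[path].add(component)
    return {
        path: sorted(components)
        for path, components in owners.items()
        if len(components) > 1
    }


# Hypothetical file lists; the real component contents will differ.
example = {
    "hadoop": ["usr/bin/hadoop", "usr/bin/setup-env.sh"],
    "hbase": ["usr/bin/hbase", "usr/bin/setup-env.sh"],
}
conflicts = find_conflicts(example)
```

Run at build or release time, a check along these lines would flag a clash before two components overwrite each other in a merged layout.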


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080121#comment-13080121 ] 

Allen Wittenauer commented on HADOOP-7521:
------------------------------------------

-1

This completely breaks with customary tarball behavior.  The expectation when you unpack a tarball is that it will be in (pkgname)-(version).  Users are *expecting* to have component separation and in many cases *prefer* component separation.  If someone wants a more integrated experience, they'll use the rpm, deb, etc., packaging.




[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080254#comment-13080254 ] 

Eric Yang commented on HADOOP-7521:
-----------------------------------

Milind, we are not creating a tarbomb here.  A tarbomb is a tarball that does not have a PREFIX directory, so it extracts many files into the working directory.  That is not what has been proposed here.  We are proposing a tarball with a PREFIX directory named usr instead of hadoop-[module]-[version].  See http://www.linfo.org/tarbomb.html for more concise information about tarbombs.


[jira] [Updated] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated HADOOP-7521:
------------------------------

    Assignee: Eric Yang
      Status: Patch Available  (was: Open)


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080723#comment-13080723 ] 

Milind Bhandarkar commented on HADOOP-7521:
-------------------------------------------

E4,

I think @aw made it very clear that tar should not be treated as rpm. But in any case, I will make it explicit by suggesting a "workaround" for *inconsiderate packaging* (whatever that means :-).

Every project should untar into a <project-name>-<version-number> directory. (Since tar is a short form of "tape archive", and tapes historically had a write-once-read-many property, a tar has exactly the same layout when expanded. So, if a tar is named xyz.tar, it has the same bits as any other xyz.tar, irrespective of the media it was stored on. If the name is the same, I feel comfortable expanding it again and again, because I am sure that untarring is idempotent.)

The user (in this case, the untarrer) sets an environment variable "PROJECTNAME_HOME=/full/path/to/<project-name>-<version-number>". For example, HADOOP_COMMON_HOME=/opt/local/hadoop-common-0.22.0, and PIG_HOME=/opt/local/pig-0.9, etc. Then, as long as the binaries are found in $XYZ_HOME/bin, jars are found in $XYZ_HOME/libexec, dependencies are found in $XYZ_HOME/lib, and configs are found in $XYZ_HOME/config, one can write scripts that check for the presence of $XYZ_HOME and adjust locations accordingly.

If $XYZ_HOME is not set, the path is checked for the presence of the primary executable, such as "hadoop" for HADOOP, HDFS, and MAPREDUCE, and "pig" for PIG. If it is symlinked, the origin is discovered, and $XYZ_HOME is derived from that. For example, if the shell script "hadoop" is in /opt/local/bin, which is in the path, and it is a symlink to /opt/hstack/hadoop-0.22.0/bin/hadoop, then /opt/hstack/hadoop-0.22.0 is taken to be $HADOOP_HOME, and the rest of the links are resolved accordingly.
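
The lookup order described above can be sketched in a few lines. This is a hedged illustration, not how the Hadoop scripts actually work: `resolve_home` and its parameters are hypothetical, and it assumes the executable lives in a `bin` directory directly under the home directory.

```python
import os
import shutil


def resolve_home(env_var, executable, environ=None, path=None):
    """Return the project home directory, or None if it cannot be determined.

    1. If the environment variable (e.g. HADOOP_HOME) is set, use it.
    2. Otherwise find the primary executable on the search path, follow
       any symlinks to its origin, and take the parent of its bin/ directory.
    """
    environ = os.environ if environ is None else environ
    home = environ.get(env_var)
    if home:
        return home
    found = shutil.which(executable, path=path)
    if found is None:
        return None
    origin = os.path.realpath(found)  # resolves symlinks to the real file
    return os.path.dirname(os.path.dirname(origin))


# With /opt/local/bin/hadoop symlinked to /opt/hstack/hadoop-0.22.0/bin/hadoop,
# this would yield /opt/hstack/hadoop-0.22.0 when HADOOP_HOME is unset.
hadoop_home = resolve_home("HADOOP_HOME", "hadoop")
```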

The point is, for the last 40 years, people have been accustomed to certain conventions. So let's not trash those conventions unless we have to, and this case does not demand it.

Right ?

(I think I have spent more time on this than it deserves. So this is my last comment on this issue.)



[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080751#comment-13080751 ] 

Eric Yang commented on HADOOP-7521:
-----------------------------------

Milind, your recommendation describes about 80% of the Hadoop 0.21 structure, with a few modifications.  Unfortunately, it does not scale when deploying several dozen software packages: managing 10-30 environment variables for all users across thousands of machines becomes nontrivial and is unlikely to survive long-term upgrades.  This is the reason that we are creating the new layout, which aligns with the rpm/deb package layout.  My intention is not to force the new layout on existing systems; rather, this package layout is used for constructing rpm/deb packages.  You should be more interested in HADOOP-7498 if you want to preserve $*_HOME in the release tarball rather than the binary tarball.


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080191#comment-13080191 ] 

Eric Yang commented on HADOOP-7521:
-----------------------------------

bq. You mean other than the fact that few to no other tarball on the Internet does this?

Python setup tools and Rubygems both support this.  It is standard practice with popular Ops tools.  Download any Python software and run "python setup.py bdist_dumb", and it generates a binary tarball with a prefix relative to the root level of the OS.  What we propose here is standard.

This change only minimizes the repetitive task of managing $PROJECT_HOME environment variables.  The original tarball design is not scalable in the long term.  Hadoop is a growing ecosystem; we should smooth rough edges to improve adoption.


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080251#comment-13080251 ] 

Allen Wittenauer commented on HADOOP-7521:
------------------------------------------

Beyond just the tarbomb problem, you've got file and permission problems.


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080202#comment-13080202 ] 

Allen Wittenauer commented on HADOOP-7521:
------------------------------------------

bq. It is standard practice with popular Ops tools. 

Yet your examples are dev tools.  

-1 remains.  Might as well close this as won't fix.


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080257#comment-13080257 ] 

Milind Bhandarkar commented on HADOOP-7521:
-------------------------------------------

Eric, from the first paragraph of the link you mentioned (I have highlighted the "or some other existing directory" part):

A tarbomb, also sometimes written as tar bomb, is a tarball whose contents appear to explode into the current directory *or some other existing directory* containing a large number of items when untarred rather than into a new directory created by the tarball specifically for such contents.

As a unix user for the last 25 years, my expectation would be that each untar creates a new directory under $cwd rather than unpacking into an existing directory.

So, I am -1 on this.
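
The definition quoted above can be turned into a rough check. The following is only a sketch: the function names and the `dest` parameter are hypothetical and not part of the Hadoop build. It treats an archive as "exploding" when it has more than one top-level entry, or when its single top-level directory already exists in the destination.

```python
import os
import tarfile
from pathlib import PurePosixPath


def top_level_entries(tar_path):
    """Return the set of first path components of all archive members."""
    roots = set()
    with tarfile.open(tar_path) as tar:
        for name in tar.getnames():
            # Drop a leading "/" or "." so "./a/b" and "a/b" are treated alike.
            parts = [p for p in PurePosixPath(name).parts if p not in (".", "/")]
            if parts:
                roots.add(parts[0])
    return roots


def explodes_into(tar_path, dest):
    """True if the archive would unpack into dest or an existing directory in it."""
    roots = top_level_entries(tar_path)
    if len(roots) != 1:
        # Several top-level entries land directly in dest.
        return True
    (root,) = roots
    # A single top-level entry is fine only if it does not exist yet.
    return os.path.lexists(os.path.join(dest, root))
```

Under this reading, a tarball rooted at a shared prefix such as usr/ is flagged as soon as a second project is unpacked into the same place, which is the behaviour being objected to here.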


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080363#comment-13080363 ] 

Eric Yang commented on HADOOP-7521:
-----------------------------------

Milind, by your definition of tarbomb, any tarball is also a tarbomb as long as $cwd already contains a directory with the same name as one in the tarball.  If I extract the hadoop-0.23 tarball to $cwd, then I should never do it again in the same directory, because that tarball becomes a tarbomb by definition.  All Debian and Slackware packages are composed of tarbombs by the same definition.  Rather than trolling about inconsiderate packaging (far from the intent of this jira), I suggest we stay objective about how to improve Hadoop software stack interoperability.  I am open to suggestions.


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080227#comment-13080227 ] 

Eric Yang commented on HADOOP-7521:
-----------------------------------

Milind, CHANGES.txt is moved to $PREFIX/share/doc/$PROJECT/CHANGES.txt.  Hence, it does not overwrite other projects' files.


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080206#comment-13080206 ] 

Milind Bhandarkar commented on HADOOP-7521:
-------------------------------------------

When I untar in $cwd, my reasonable expectation is that a new directory is created, and my existing directories are not overwritten. So, even by mistake, a hadoop CHANGES.txt is not overwritten by pig CHANGES.txt etc. I completely agree with Allen. tar should not be treated like rpm.
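
The overwrite concern can be checked before extracting. This is a minimal sketch, assuming a hypothetical helper named `would_overwrite` that is not part of any Hadoop tooling. It lists the regular files in a tarball whose target path already exists under the destination directory.

```python
import os
import tarfile


def would_overwrite(tar_path, dest):
    """List regular-file members whose target path already exists under dest."""
    clashes = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            # Only regular files can clobber content; shared directories are expected.
            if member.isfile() and os.path.lexists(os.path.join(dest, member.name)):
                clashes.append(member.name)
    return clashes
```

An empty result means extraction would only add new files; a non-empty result names the files, such as a top-level CHANGES.txt, that a second project would replace.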


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080179#comment-13080179 ] 

Eric Yang commented on HADOOP-7521:
-----------------------------------

bq. then I'm still -1.
bq. That is just a flawed idea to try to treat tar as equivalent of rpm. They aren't.

Could you provide the reason why this is not a good idea?


[jira] [Updated] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated HADOOP-7521:
------------------------------

    Resolution: Not A Problem
        Status: Resolved  (was: Patch Available)

Closing this as "not a problem".  I respect the community's opinion on this matter.  For a merged layout, users can run 'for f in ../*.tar.gz; do tar -xv --strip-components 1 -f "$f"; done' on their own.
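
The merged layout can also be produced with a short script. Note that tar's -f option names a single archive, so a glob that matches several tarballs has to be processed one archive at a time. The following is only a sketch of the --strip-components idea in Python; `extract_stripped` and `merge_tarballs` are hypothetical names, and unlike tar this version handles regular files only, skipping links and other special members.

```python
import os
import shutil
import tarfile


def extract_stripped(tar_path, dest, strip=1):
    """Extract regular files into dest, dropping the first `strip` path components."""
    written = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # directories are created on demand; links are skipped
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if ".." in parts or len(parts) <= strip:
                continue  # refuse path traversal; nothing is left after stripping
            rel_parts = parts[strip:]
            target = os.path.join(dest, *rel_parts)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with tar.extractfile(member) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            os.chmod(target, member.mode & 0o777)
            written.append("/".join(rel_parts))
    return written


def merge_tarballs(tar_paths, dest):
    """Unpack several per-project tarballs into one shared prefix directory."""
    written = []
    for path in tar_paths:
        written.extend(extract_stripped(path, dest))
    return written
```

Later archives silently replace files written by earlier ones, which is the trade-off of a merged layout that this thread debates.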


[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080161#comment-13080161 ] 

Allen Wittenauer commented on HADOOP-7521:
------------------------------------------

bq. Allen, the isolated tarball (pkgname-version) is still supported by tar profile. We are discussing merged layout here.

If "merged layout" is:

bq. Therefore, user can untar hbase, hadoop, zookeeper, pig, hive all into the same location and run from the top level directory without manually renaming them to the same directory again.

then I'm still -1.

That is just a flawed idea to try to treat tar as equivalent of rpm.  They aren't.




[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080183#comment-13080183 ] 

Allen Wittenauer commented on HADOOP-7521:
------------------------------------------

You mean other than the fact that few to no other tarballs on the Internet do this?

People who use binary tarballs to deploy things where there is an RPM almost always want package separation and higher levels of control of where things get placed. Changing this paradigm is going to be surprising and counter to those end user goals.

In other words:  This isn't broke.  Stop trying to fix it.
