Posted to common-issues@hadoop.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/12/07 06:08:00 UTC

[jira] [Commented] (HADOOP-15096) start-build-env.sh can create a docker image that fills up disk

    [ https://issues.apache.org/jira/browse/HADOOP-15096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281372#comment-16281372 ] 

ASF GitHub Bot commented on HADOOP-15096:
-----------------------------------------

GitHub user addisonj opened a pull request:

    https://github.com/apache/hadoop/pull/311

    [HADOOP-15096] Don't create user with lastlog

    This fixes a problem where, in certain cases, the useradd command creates a very large layer diff that can fill the host disk.
    
    The cause is that lastlog is a sparse file, but AUFS under Docker apparently doesn't handle sparse files well and writes it out as a very large file.
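
To illustrate the sparse-file behavior mentioned above, here is a minimal sketch. It assumes a Linux host with GNU coreutils (`truncate`, `stat -c`) and a filesystem that supports sparse files. The file name and the 64M size are arbitrary choices for illustration and are not part of the patch.

```shell
#!/bin/sh
# Compare a sparse file's apparent size with the space actually allocated for it.
set -eu

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# truncate(1) extends the file without writing data, leaving one large hole.
truncate -s 64M "$tmp/sparse.img"

# %s = logical size in bytes; %b = allocated blocks; %B = bytes per block reported by %b.
apparent=$(stat -c %s "$tmp/sparse.img")
blocks=$(stat -c %b "$tmp/sparse.img")
block_size=$(stat -c %B "$tmp/sparse.img")
allocated=$(( blocks * block_size ))

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"
```

On a filesystem that handles sparse files well, the allocated figure stays far below the apparent size. The report suggests that under AUFS the file ends up occupying its full apparent size instead.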

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/addisonj/hadoop patch-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/311.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #311
    
----
commit f5ffd36e78be66d9b075050f948d8af67cf53b56
Author: Addison Higham <ad...@gmail.com>
Date:   2017-12-07T06:07:27Z

    [HADOOP-15096] Don't create user with lastlog
    
    This fixes a problem where, in certain cases, the useradd command creates a very large layer diff that can fill the host disk.
    
    The cause is that lastlog is a sparse file, but AUFS under Docker apparently doesn't handle sparse files well and writes it out as a very large file.

----


> start-build-env.sh can create a docker image that fills up disk
> ---------------------------------------------------------------
>
>                 Key: HADOOP-15096
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15096
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 3.1.0
>            Reporter: Addison Higham
>
> start-build-env.sh has the potential to build an image that can fill up root disks by exploding a sparse file.
> In my case, the right ingredients are:
> Ubuntu 17.04
> Docker 17.09.0
> AUFS storage driver
> A user ID and group ID with very high numeric values
> This happens when building the hadoop-build-${USER_ID} image, specifically in the 
> {code}
> RUN useradd -g ${GROUP_ID} -u ${USER_ID} -k /root -m ${USER_NAME}
> {code}
> command.
> The reason for this:
> /var/log/lastlog is a sparse file whose apparent size is determined by the highest UID seen; in my case, that UID is very high (above 1 billion). Locally, this results in a sparse file that reports as 443 GB. However, under Docker, and specifically AUFS, it appears that this file *isn't* sparse, and Docker tries to allocate the whole file.
> If you start this script and walk away to wait for it to finish, you come back to a computer with a completely full disk.
> Luckily, the fix is quite easy: add the `-l` (`--no-log-init`) option to useradd, which stops it from adding the user to the lastlog and faillog databases.
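
As a rough sketch of the reasoning above, the snippet below estimates how large lastlog would appear for a given UID and prints the useradd instruction with `-l` added. The 292-byte record size is an assumption (typical for x86_64 glibc, not stated in the report), and `builder` and the ID values are hypothetical placeholders. The exact position of `-l` in the real patch may differ.

```shell
#!/bin/sh
set -eu

# Assumption: 292 bytes per lastlog record (typical on x86_64 glibc; not from the report).
LASTLOG_RECORD_BYTES=292

# Records are indexed by UID, so the file must extend to (uid + 1) records.
lastlog_size_bytes() {
    echo $(( ($1 + 1) * LASTLOG_RECORD_BYTES ))
}

# Hypothetical values; start-build-env.sh derives the real ones from the invoking user.
USER_NAME=builder
USER_ID=1500000000
GROUP_ID=1500000000

echo "estimated lastlog size: $(lastlog_size_bytes "$USER_ID") bytes"

# Print (rather than run) the Dockerfile instruction with -l added, since useradd needs root.
fixed_line="RUN useradd -l -g ${GROUP_ID} -u ${USER_ID} -k /root -m ${USER_NAME}"
echo "$fixed_line"
```

With a UID around 1.5 billion the estimate comes to several hundred gigabytes, the same order of magnitude as the 443 GB described in the report.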



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
