Posted to issues@kudu.apache.org by "Casey Ching (JIRA)" <ji...@apache.org> on 2016/04/16 00:22:25 UTC

[jira] [Commented] (KUDU-1419) Kudu may fail to start in docker when using Ubuntu/AUFS

    [ https://issues.apache.org/jira/browse/KUDU-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243752#comment-15243752 ] 

Casey Ching commented on KUDU-1419:
-----------------------------------

Oh, and this is what the user sees:

{noformat}
E0413 17:39:06.583531 10566 master.cc:135] Master@0.0.0.0:7051: Unable to init master catalog manager: IO error: Unable to initialize catalog manager: Failed to initialize sys tables async: Could not move log directory /home/dev/Impala/testdata/cluster/cdh5/node-1/var/lib/kudu/master/wal/wals/00000000000000000000000000000000 to recovery dir /home/dev/Impala/testdata/cluster/cdh5/node-1/var/lib/kudu/master/wal/wals/00000000000000000000000000000000.recovery: /home/dev/Impala/testdata/cluster/cdh5/node-1/var/lib/kudu/master/wal/wals/00000000000000000000000000000000: Invalid cross-device link (error 18)
{noformat}

Error 18 is EXDEV, which POSIX describes for rename() as follows:

{noformat}
[EXDEV]
The links named by new and old are on different file systems and the implementation does not support links between file systems.
{noformat}

http://pubs.opengroup.org/onlinepubs/009695399/functions/rename.html
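
A caller can treat EXDEV as a signal to fall back to copy-and-remove, which is the approach suggested in the issue description below. The following is a minimal Python sketch of that idea, not Kudu code: the helper name move_with_exdev_fallback and its return values are hypothetical, and it assumes the destination does not already exist.

```python
import errno
import os
import shutil


def move_with_exdev_fallback(src, dst):
    """Rename src to dst; if rename fails with EXDEV, copy then remove.

    Hypothetical helper for illustration only. Returns "renamed" when
    os.rename() succeeded and "copied" when the fallback path was used.
    """
    try:
        os.rename(src, dst)
        return "renamed"
    except OSError as e:
        # Only EXDEV triggers the fallback; any other error is re-raised.
        if e.errno != errno.EXDEV:
            raise

    if os.path.isdir(src) and not os.path.islink(src):
        # Directory: copy the whole tree, then delete the original.
        shutil.copytree(src, dst, symlinks=True)
        shutil.rmtree(src)
    else:
        # Regular file or symlink: copy the file, then delete the original.
        shutil.copy2(src, dst)
        os.unlink(src)
    return "copied"
```

Note that the fallback is not atomic the way a successful rename() is: a crash between the copy and the remove can leave both the source and the destination on disk, so a caller that relies on atomic renames for recovery would need to handle that case.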

> Kudu may fail to start in docker when using Ubuntu/AUFS
> -------------------------------------------------------
>
>                 Key: KUDU-1419
>                 URL: https://issues.apache.org/jira/browse/KUDU-1419
>             Project: Kudu
>          Issue Type: Bug
>          Components: util
>            Reporter: Casey Ching
>
> By default, Ubuntu's Docker setup uses AUFS for its storage layer. This causes problems during startup because rename() on a directory can fail with EXDEV on AUFS.
> {quote}
> To rename(2) directory may return EXDEV even if both of src and tgt are on the same aufs. When the rename-src dir exists on multiple branches and the lower dir has child(ren), aufs has to copyup all his children. It can be recursive copyup. Current aufs does not support such huge copyup operation at one time in kernel space, instead produces a warning and returns EXDEV. Generally, mv(1) detects this error and tries mkdir(2) and rename(2) or copy/unlink recursively. So the result is harmless. If your application which issues rename(2) for a directory does not support EXDEV, it will not work on aufs. Also this specification is applied to the case when the src directory exists on the lower readonly branch and it has child(ren).
> {quote}
> http://aufs.sourceforge.net/aufs.html
> Starting the master may try to rename() the log directory to a recovery directory:
> {code}
>     RETURN_NOT_OK_PREPEND(fs_manager->env()->RenameFile(log_dir, recovery_path),
>                           Substitute("Could not move log directory $0 to recovery dir $1",
>                                      log_dir, recovery_path));
> {code}
> https://github.com/cloudera/kudu/blob/master/src/kudu/tablet/tablet_bootstrap.cc#L597
> {code}
>   virtual Status RenameFile(const std::string& src, const std::string& target) OVERRIDE {
>     TRACE_EVENT2("io", "PosixEnv::RenameFile", "src", src, "dst", target);
>     ThreadRestrictions::AssertIOAllowed();
>     Status result;
>     if (rename(src.c_str(), target.c_str()) != 0) {
>       result = IOError(src, errno);
>     }
>     return result;
>   }
> {code}
> https://github.com/cloudera/kudu/blob/master/src/kudu/util/env_posix.cc#L891
> I think Kudu should fall back to copy/remove when rename() fails. As an example, here is what Python's shutil.move() does:
> {code}
>     try:
>         os.rename(src, real_dst)
>     except OSError:
>         if os.path.isdir(src):
>             if _destinsrc(src, dst):
>                 raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
>             copytree(src, real_dst, symlinks=True)
>             rmtree(src)
>         else:
>             copy2(src, real_dst)
>             os.unlink(src)
> {code}
> https://hg.python.org/cpython/file/2.7/Lib/shutil.py#l295


