Posted to issues@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/06/06 23:42:00 UTC

[jira] [Created] (IMPALA-7138) Impala gets confused by multiple scratch directories on Device Mapper volumes

Tim Armstrong created IMPALA-7138:
-------------------------------------

             Summary: Impala gets confused by multiple scratch directories on Device Mapper volumes
                 Key: IMPALA-7138
                 URL: https://issues.apache.org/jira/browse/IMPALA-7138
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 2.12.0, Impala 3.0
            Reporter: Tim Armstrong
            Assignee: Tim Armstrong


I've heard of a couple of instances of Impala doing the wrong thing when multiple scratch directories are present on a single Device Mapper logical volume. E.g. in the following setup, Impala only uses one of the configured scratch directories, as if they were all on one disk.
{noformat}
>> mount list
/dev/mapper/xx01-lv_data01 on /data01 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx02-lv_data02 on /data02 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx03-lv_data03 on /data03 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx04-lv_data04 on /data04 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx05-lv_data05 on /data05 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx06-lv_data06 on /data06 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx07-lv_data07 on /data07 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx08-lv_data08 on /data08 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx09-lv_data09 on /data09 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx10-lv_data10 on /data10 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx11-lv_data11 on /data11 type ext4 (rw,relatime,data=ordered)
/dev/mapper/xx12-lv_data12 on /data12 type ext4 (rw,relatime,data=ordered)

Scratch dirs are:
/data01/impala/impalad,/data02/impala/impalad,/data03/impala/impalad,/data04/impala/impalad,/data05/impala/impalad,/data06/impala/impalad,/data07/impala/impalad,/data08/impala/impalad,/data09/impala/impalad,/data10/impala/impalad,/data11/impala/impalad,/data12/impala/impalad


~~~ tmp-file-mgr.cc:122] Using scratch directory /data01/impala/impalad/impala-scratch on disk13. 
{noformat}

A workaround is to set --allow_multiple_scratch_dirs_per_device=true
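
To make the behaviour above concrete, here is a minimal sketch of per-device deduplication of scratch directories and of how an override flag bypasses it. This is not Impala's actual code: the function select_scratch_dirs, its parameters, and the device_id_of lookup are hypothetical names chosen for illustration, and the sketch assumes that every directory in the setup above resolves to the same disk id (13, as in the log line).

```python
from typing import Callable, Hashable, List


def select_scratch_dirs(
    dirs: List[str],
    device_id_of: Callable[[str], Hashable],
    allow_multiple_per_device: bool = False,
) -> List[str]:
    """Keep at most one scratch directory per device unless the override is set.

    Hypothetical sketch: device_id_of stands in for whatever lookup maps a
    path to a disk id; it is not a real Impala function.
    """
    if allow_multiple_per_device:
        # Override set: use every configured directory as-is.
        return list(dirs)

    seen = set()
    selected = []
    for d in dirs:
        dev = device_id_of(d)
        if dev in seen:
            # A directory on this device was already chosen; skip this one.
            continue
        seen.add(dev)
        selected.append(d)
    return selected


# The twelve scratch directories from the report.
scratch_dirs = [f"/data{i:02d}/impala/impalad" for i in range(1, 13)]


def same_disk(path: str) -> int:
    # Assumption for illustration: every path is attributed to disk id 13.
    return 13


deduped = select_scratch_dirs(scratch_dirs, same_disk)
all_dirs = select_scratch_dirs(
    scratch_dirs, same_disk, allow_multiple_per_device=True
)

print(deduped)        # only the first directory survives deduplication
print(len(all_dirs))  # all twelve directories are kept with the override
```

Under that assumption, deduplication collapses the list to the first directory, which matches the single "Using scratch directory" log line, while the override keeps all twelve.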

There are a few questions here:
# Does the deduplication logic even make sense to have?
# Do we need to fix how we treat Device Mapper volumes?


