Posted to issues@nifi.apache.org by "William Nouet (JIRA)" <ji...@apache.org> on 2017/07/21 13:38:00 UTC

[jira] [Updated] (NIFI-4213) PutHDFS umask not working

     [ https://issues.apache.org/jira/browse/NIFI-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Nouet updated NIFI-4213:
--------------------------------
    Description: 
The PutHDFS permission umask property is not working. The umask is set when the processor is scheduled to run as per below:

    @OnScheduled
    public void onScheduled(ProcessContext context) throws Exception {
        super.abstractOnScheduled(context);

        // Set umask once, to avoid thread safety issues doing it in onTrigger
        final PropertyValue umaskProp = context.getProperty(UMASK);
        final short dfsUmask;
        if (umaskProp.isSet()) {
            dfsUmask = Short.parseShort(umaskProp.getValue(), 8);
        } else {
            dfsUmask = FsPermission.DEFAULT_UMASK;
        }
        // The umask is applied to this Configuration instance.
        final Configuration conf = getConfiguration();
        FsPermission.setUMask(conf, new FsPermission(dfsUmask));
    }
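
As a rough check of the octal parsing above, the following stand-alone sketch parses a umask string the same way and applies it with standard umask semantics (every bit set in the mask is cleared from the requested mode). The class and method names are made up for illustration, and no Hadoop or NiFi classes are involved.

```java
// Illustrative only: UmaskDemo and its methods are hypothetical, not NiFi or Hadoop code.
final class UmaskDemo {

    /** Parses an octal umask string such as "022", mirroring Short.parseShort(value, 8). */
    static short parseUmask(String octal) {
        return Short.parseShort(octal, 8);
    }

    /** Clears from the requested permission bits every bit that is set in the umask. */
    static int applyUmask(int requestedMode, short umask) {
        return requestedMode & ~umask;
    }

    public static void main(String[] args) {
        short umask = parseUmask("022");
        int mode = applyUmask(0666, umask);
        System.out.println(Integer.toOctalString(umask)); // prints 22
        System.out.println(Integer.toOctalString(mode));  // prints 644
    }
}
```

A umask of 022 therefore turns a requested mode of 666 into 644, which is the effect the processor property is meant to control.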

However, when a flowfile is processed, onTrigger loads the configuration again:

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
        final FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }

        final FileSystem hdfs = getFileSystem();
        // This Configuration instance is obtained separately from the one used in onScheduled.
        final Configuration configuration = getConfiguration();
        ...
    }

This is the configuration used when the file is written to HDFS, so it does not carry the umask set previously in onScheduled, only the default one (hdfs-site.xml). As a result, the umask property has no effect.

The fix should be easy: either externalize the configuration and reuse it in onTrigger, or set a new hdfsResources in onScheduled.
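
The behaviour described in this report, and the first suggested direction, can be modelled without Hadoop. This is a minimal sketch under stated assumptions: a plain Map stands in for the Hadoop Configuration, loadConfiguration() is a hypothetical stand-in that returns a fresh copy of the defaults on every call, and the key string is used only as an illustrative label. It is not the actual PutHDFS code or the eventual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins throughout; none of these are NiFi or Hadoop classes.
final class ConfigCachingSketch {

    // Illustrative key under which the umask is stored in the stand-in configuration.
    static final String UMASK_KEY = "fs.permissions.umask-mode";

    /** Stand-in for loading configuration from site files: a fresh copy of the defaults each call. */
    static Map<String, String> loadConfiguration() {
        Map<String, String> conf = new HashMap<>();
        conf.put(UMASK_KEY, "022"); // pretend this default came from hdfs-site.xml
        return conf;
    }

    /** Pattern described in the report: the umask is set on one copy, then a second copy is read. */
    static String umaskSeenWithoutCaching(String configuredUmask) {
        Map<String, String> scheduledConf = loadConfiguration();
        scheduledConf.put(UMASK_KEY, configuredUmask);          // what "onScheduled" does
        Map<String, String> triggerConf = loadConfiguration();  // "onTrigger" loads again
        return triggerConf.get(UMASK_KEY);                      // the configured umask is lost
    }

    /** Suggested direction: keep the instance that received the umask and reuse it later. */
    static String umaskSeenWithCaching(String configuredUmask) {
        Map<String, String> sharedConf = loadConfiguration();
        sharedConf.put(UMASK_KEY, configuredUmask);             // what "onScheduled" does
        return sharedConf.get(UMASK_KEY);                       // "onTrigger" reuses sharedConf
    }

    public static void main(String[] args) {
        System.out.println(umaskSeenWithoutCaching("077")); // prints 022
        System.out.println(umaskSeenWithCaching("077"));    // prints 077
    }
}
```

In a real processor the shared instance would also need to be published safely to the threads that run onTrigger; the sketch leaves that out.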


> PutHDFS umask not working
> -------------------------
>
>                 Key: NIFI-4213
>                 URL: https://issues.apache.org/jira/browse/NIFI-4213
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: William Nouet
>
> The PutHDFS permission umask property is not working. The umask is set when the processor is scheduled to run as per below:
>     @OnScheduled
>     public void onScheduled(ProcessContext context) throws Exception {
>         super.abstractOnScheduled(context);
>         // Set umask once, to avoid thread safety issues doing it in onTrigger
>         final PropertyValue umaskProp = context.getProperty(UMASK);
>         final short dfsUmask;
>         if (umaskProp.isSet()) {
>             dfsUmask = Short.parseShort(umaskProp.getValue(), 8);
>         } else {
>             dfsUmask = FsPermission.DEFAULT_UMASK;
>         }
>         // The umask is applied to this Configuration instance.
>         final Configuration conf = getConfiguration();
>         FsPermission.setUMask(conf, new FsPermission(dfsUmask));
>     }
> However, when a flowfile is processed, onTrigger loads the configuration again:
>     @Override
>     public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
>         final FlowFile flowFile = session.get();
>         if (flowFile == null) {
>             return;
>         }
>         final FileSystem hdfs = getFileSystem();
>         // This Configuration instance is obtained separately from the one used in onScheduled.
>         final Configuration configuration = getConfiguration();
>         ...
>     }
> This is the configuration used when the file is written to HDFS, so it does not carry the umask set previously in onScheduled, only the default one (hdfs-site.xml). As a result, the umask property has no effect.
> The fix should be easy: either externalize the configuration and reuse it in onTrigger, or set a new hdfsResources in onScheduled.


