Posted to issues@nifi.apache.org by "William Nouet (JIRA)" <ji...@apache.org> on 2017/07/21 13:38:00 UTC
[jira] [Updated] (NIFI-4213) PutHDFS umask not working
[ https://issues.apache.org/jira/browse/NIFI-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
William Nouet updated NIFI-4213:
--------------------------------
Description:
The PutHDFS permission umask property has no effect. The umask is set when the processor is scheduled to run, as shown below:
@OnScheduled
public void onScheduled(ProcessContext context) throws Exception {
    super.abstractOnScheduled(context);
    // Set umask once, to avoid thread safety issues doing it in onTrigger
    final PropertyValue umaskProp = context.getProperty(UMASK);
    final short dfsUmask;
    if (umaskProp.isSet()) {
        dfsUmask = Short.parseShort(umaskProp.getValue(), 8);
    } else {
        dfsUmask = FsPermission.DEFAULT_UMASK;
    }
    // Relevant lines: the umask is applied to the Configuration obtained here
    final Configuration conf = getConfiguration();
    FsPermission.setUMask(conf, new FsPermission(dfsUmask));
}
However, when a flowfile is processed, a new configuration is loaded:
@Override
public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
    final FlowFile flowFile = session.get();
    if (flowFile == null) {
        return;
    }
    final FileSystem hdfs = getFileSystem();
    // Relevant line: the configuration is fetched again here
    final Configuration configuration = getConfiguration();
    ...
}
This configuration is the one used when putting the file to HDFS, so it does not carry the umask set previously in onScheduled, only the default one (from hdfs-site.xml). As a result, the umask property has no effect.
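To make the consequence concrete, here is a small sketch of how a umask changes the permissions a new file ends up with. It uses plain Java rather than the Hadoop API; the class and method names are hypothetical, and the values 022 (a common default) and 077 (a stricter setting someone might configure) are assumptions chosen for illustration.

```java
public class UmaskDemo {
    /** Parses an octal umask string the same way the onScheduled snippet does. */
    static short parseUmask(String octal) {
        return Short.parseShort(octal, 8);
    }

    /** Clears the bits named by the umask from the requested permission bits. */
    static int applyUmask(int requested, short umask) {
        return requested & ~umask;
    }

    public static void main(String[] args) {
        int requested = 0666; // rw-rw-rw-, an assumed request for a new file

        short defaultUmask = parseUmask("022");    // assumed default value
        short configuredUmask = parseUmask("077"); // assumed value set on the processor

        // With the default umask the file is group- and world-readable.
        System.out.println(Integer.toOctalString(applyUmask(requested, defaultUmask)));
        // With the configured umask only the owner keeps access.
        System.out.println(Integer.toOctalString(applyUmask(requested, configuredUmask)));
    }
}
```

If the configured value is silently replaced by the default, files that were meant to be owner-only would be written with broader permissions.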
The fix should be straightforward: either externalize the configuration so that onTrigger picks up the same instance, or set a new hdfsResources in onScheduled.
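The pattern described in this report can be sketched without NiFi or Hadoop. The example below is only an illustration under the assumption stated in the report, namely that each getConfiguration() call yields a configuration that does not carry earlier changes. A HashMap stands in for the Hadoop Configuration, and all class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigReuseDemo {
    /** Stand-in for loading defaults, as if read from hdfs-site.xml. */
    static Map<String, String> loadDefaults() {
        Map<String, String> conf = new HashMap<>();
        conf.put("umask", "022"); // assumed default value
        return conf;
    }

    /** Reported behaviour: the umask is set on one copy, then a fresh copy is read. */
    static String umaskWithFreshConfig(String configuredUmask) {
        Map<String, String> scheduledConf = loadDefaults(); // "onScheduled"
        scheduledConf.put("umask", configuredUmask);

        Map<String, String> triggerConf = loadDefaults();   // "onTrigger" loads again
        return triggerConf.get("umask");
    }

    /** Suggested direction: keep the instance that was modified and reuse it. */
    static String umaskWithRetainedConfig(String configuredUmask) {
        Map<String, String> retained = loadDefaults();      // "onScheduled"
        retained.put("umask", configuredUmask);

        Map<String, String> triggerConf = retained;         // "onTrigger" reuses it
        return triggerConf.get("umask");
    }

    public static void main(String[] args) {
        System.out.println(umaskWithFreshConfig("077"));    // configured value is lost
        System.out.println(umaskWithRetainedConfig("077")); // configured value survives
    }
}
```

Retaining the instance keeps the "set once in onScheduled" approach from the original comment about thread safety, since onTrigger only reads the value.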
> PutHDFS umask not working
> -------------------------
>
> Key: NIFI-4213
> URL: https://issues.apache.org/jira/browse/NIFI-4213
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 1.1.1
> Reporter: William Nouet
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)