Posted to users@nifi.apache.org by Arturo Michel <Ar...@leotech.com.sg> on 2016/05/11 14:39:18 UTC

CreateHadoopSequenceFile Processor Key adding sf suffix

I am using the CreateHadoopSequenceFile processor to create a sequence file from incoming data, effectively timestamping the data at this point: the current time is the key and the data is the value. To do this I temporarily set the filename attribute to ${now()}, so that the resulting sequence file has the time as its key and the data as its content. However, the processor appends the .sf suffix to the file name, and that suffix makes it all the way into the key.


I end up with the following structure [40668712567.sf | [data bytes]].


I understand that the file is written as filename.sf, but shouldn't the key omit the .sf suffix and be just the original file name?


The relevant code is in https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/CreateHadoopSequenceFile.java:


    final String fileName = flowFile.getAttribute(CoreAttributes.FILENAME.key()) + ".sf";
    flowFile = session.putAttribute(flowFile, CoreAttributes.FILENAME.key(), fileName);
    try {
        flowFile = sequenceFileWriter.writeSequenceFile(flowFile, session, getConfiguration(), compressionType);
        session.transfer(flowFile, RELATIONSHIP_SUCCESS);
        getLogger().info("Transferred flowfile {} to {}", new Object[]{flowFile, RELATIONSHIP_SUCCESS});
    } catch (ProcessException e) {
        getLogger().error("Failed to create Sequence File. Transferring {} to 'failure'", new Object[]{flowFile}, e);
        session.transfer(flowFile, RELATIONSHIP_FAILURE);
    }



The file name is changed before the flow file is passed to the writer. The default sequence file writer (and, I think, the others as well) uses the file name as received when writing the key:


https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/SequenceFileWriterImpl.java


    String key = flowFile.getAttribute(CoreAttributes.FILENAME.key());
    writer.append(new Text(key), inStreamWritable);
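One possible direction, sketched here only to illustrate the behaviour I expected: derive the key from the file name with a trailing .sf removed. The class and method names below (SequenceFileKeys, keyFor) are hypothetical and not part of NiFi; this is a standalone sketch under that assumption, not a patch to the processor.

```java
public final class SequenceFileKeys {

    // Suffix appended to the filename attribute in the processor snippet quoted above.
    private static final String SUFFIX = ".sf";

    private SequenceFileKeys() {
    }

    // Hypothetical helper: returns the file name without a trailing ".sf".
    // Names that do not end in ".sf" (and null) are returned unchanged.
    public static String keyFor(String fileName) {
        if (fileName != null && fileName.endsWith(SUFFIX)) {
            return fileName.substring(0, fileName.length() - SUFFIX.length());
        }
        return fileName;
    }

    public static void main(String[] args) {
        // The key from my example above, with the suffix stripped.
        System.out.println(keyFor("40668712567.sf")); // prints 40668712567
    }
}
```

With something like this, the file could still be written as filename.sf while the key stays equal to the original file name, assuming the writer is free to compute the key itself.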



I am trying to timestamp the data because the source system does not have that capability. Suggestions on how to work around this issue are welcome.



Best Regards.









