Posted to commits@nifi.apache.org by "Joseph Percivall (JIRA)" <ji...@apache.org> on 2015/10/27 22:19:27 UTC

[jira] [Created] (NIFI-1077) Allow ConvertCharacterSet to accept expression language

Joseph Percivall created NIFI-1077:
--------------------------------------

             Summary: Allow ConvertCharacterSet to accept expression language
                 Key: NIFI-1077
                 URL: https://issues.apache.org/jira/browse/NIFI-1077
             Project: Apache NiFi
          Issue Type: Improvement
            Reporter: Joseph Percivall
            Assignee: Joseph Percivall
            Priority: Minor


This issue arose from a user's question on the mailing list, quoted below. It demonstrates the need to be able to use expression language to set the incoming (and potentially outgoing) character sets:

I'm looking to process many files into common formats.  The source files arrive in various character sets, MIME types, and newline terminators.

My thinking for a data flow was along these lines:

GetFile (from many sub directories) -> 
ExecuteStreamCommand (file -i) ->
ConvertCharacterSet (from previous command to utf8) ->
ReplaceText (to change any \r\n into \n) ->
PutFile (into a directory structure based on values found in the original file path and filename)
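
For illustration only, the content transformation that the ConvertCharacterSet and ReplaceText steps above are meant to achieve can be sketched outside NiFi in a few lines of Python. This is not how the processors are implemented; the function name `to_utf8_unix` is hypothetical, and the source character set is assumed to have been detected already by an earlier step.

```python
def to_utf8_unix(data: bytes, source_charset: str) -> bytes:
    """Sketch of the intended per-file transformation (hypothetical helper).

    Decodes the content using the previously detected character set,
    replaces every CRLF line terminator with LF, and re-encodes as UTF-8.
    """
    text = data.decode(source_charset)   # raises UnicodeDecodeError on bad input
    text = text.replace("\r\n", "\n")    # same substitution as the ReplaceText step
    return text.encode("utf-8")
```

The key point is that `source_charset` is a per-file value, which is why the flow needs it to come from an attribute rather than from a fixed processor property.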

Additional steps would be added for archiving a copy of the original, converting xml files, etc.

Attempting to process these with NiFi leaves me confused about how to do this within the tool.  To use ConvertCharacterSet, I have to know the input character set.  I set up an ExecuteStreamCommand to run file -i ${absolute.path:append(${filename})}, which returned the expected values.  However, I don't see a way to turn these results into input for the processor, because it doesn't accept expression language for that field.
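
As a rough sketch of what "turning these results into input" would involve, the snippet below extracts the MIME type and character set from a single line of `file -i` output. It assumes the common `path: type/subtype; charset=name` output shape; the function name `parse_file_i` and the sample path are hypothetical, and real output may vary by platform and `file` version.

```python
import re

# Assumed shape of one line of `file -i` output:
#   /some/path/name.txt: text/plain; charset=iso-8859-1
_FILE_I = re.compile(r":\s*(?P<mime>[^;\s]+)\s*;\s*charset=(?P<charset>\S+)\s*$")


def parse_file_i(line: str):
    """Return (mime_type, charset) parsed from a `file -i` line, or None."""
    match = _FILE_I.search(line.strip())
    if match is None:
        return None  # output did not have the assumed shape
    return match.group("mime"), match.group("charset")
```

With expression language support on the character set property, a value extracted this way could be carried as a flow file attribute and referenced by the processor instead of being hard-coded.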

I also considered ConvertCSVToAvro as an interim step but noticed the same issue.  Any suggestions on what this dataflow should look like?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)