Posted to issues@nifi.apache.org by "Arpad Boda (JIRA)" <ji...@apache.org> on 2019/04/08 10:34:00 UTC

[jira] [Commented] (MINIFICPP-726) Enhance ExtractText to have more feature parity with the Java impl

    [ https://issues.apache.org/jira/browse/MINIFICPP-726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812319#comment-16812319 ] 

Arpad Boda commented on MINIFICPP-726:
--------------------------------------

[~phrocker]: yes, some progress has been made: I implemented the regex logic.

Your point about I/O is absolutely fair.

My initial idea for this was to provide a caching content repo: it writes data to IO, but only removes data from memory if it hasn't been read for a while. This timeout could be configurable. I think this feature could ensure that no I/O read happens, because the data is still in memory while the given flowfile goes through the flowchain.
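
To make the caching idea above more concrete, here is a minimal sketch. All names (CachingContentRepo, write, read, evictIdle) are hypothetical and are not the MiNiFi C++ content repository API; the "disk" is simulated with an in-memory map so the example stays self-contained. It only illustrates the write-through plus idle-timeout eviction behaviour, assuming eviction is triggered by some periodic caller.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical sketch, not the MiNiFi C++ ContentRepository interface.
class CachingContentRepo {
 public:
  using Clock = std::chrono::steady_clock;

  explicit CachingContentRepo(std::chrono::milliseconds idle_timeout)
      : idle_timeout_(idle_timeout) {}

  // Write-through: persist the data and also keep a copy in memory.
  void write(const std::string& id, std::string data,
             Clock::time_point now = Clock::now()) {
    disk_[id] = data;  // stand-in for the real I/O write
    cache_[id] = Entry{std::move(data), now};
  }

  // Serve from memory when possible; fall back to "disk" otherwise.
  // A disk read does not re-populate the cache in this sketch.
  bool read(const std::string& id, std::string& out,
            Clock::time_point now = Clock::now()) {
    auto cached_it = cache_.find(id);
    if (cached_it != cache_.end()) {
      cached_it->second.last_access = now;  // reading refreshes the idle timer
      out = cached_it->second.data;
      return true;
    }
    auto disk_it = disk_.find(id);
    if (disk_it == disk_.end()) return false;
    ++disk_reads_;  // this is the I/O read the cache is meant to avoid
    out = disk_it->second;
    return true;
  }

  // Drop in-memory copies that have not been touched for idle_timeout_.
  // Assumed to be called periodically, e.g. from a housekeeping thread.
  void evictIdle(Clock::time_point now = Clock::now()) {
    for (auto it = cache_.begin(); it != cache_.end();) {
      if (now - it->second.last_access >= idle_timeout_) {
        it = cache_.erase(it);
      } else {
        ++it;
      }
    }
  }

  bool cached(const std::string& id) const { return cache_.count(id) != 0; }
  std::size_t diskReads() const { return disk_reads_; }

 private:
  struct Entry {
    std::string data;
    Clock::time_point last_access;
  };

  std::chrono::milliseconds idle_timeout_;
  std::unordered_map<std::string, Entry> cache_;
  std::unordered_map<std::string, std::string> disk_;  // simulated persistent store
  std::size_t disk_reads_ = 0;
};
```

With a timeout longer than the time a flowfile typically spends in the flowchain, every read in this sketch would be served from memory and diskReads() would stay at zero. The sketch is not thread-safe and puts no bound on memory use; a real implementation would need both.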

Doing it in the write phase sounds better (more efficient in memory handling), but I wonder how we can do that while trying to keep compatibility with NiFi.

A NiFi-Fn-like behavior (all or nothing, only persist at the end of the flowchain) would also make sense in such cases. This would even allow modifying the content without paying huge IO costs.

I think this topic is definitely worth some brainstorming; I will create a Jira to collect ideas.

> Enhance ExtractText to have more feature parity with the Java impl
> ------------------------------------------------------------------
>
>                 Key: MINIFICPP-726
>                 URL: https://issues.apache.org/jira/browse/MINIFICPP-726
>             Project: NiFi MiNiFi C++
>          Issue Type: New Feature
>            Reporter: Aldrin Piri
>            Assignee: Arpad Boda
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> ExtractText is limited in terms of functionality in contrast to the Java variant https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.8.0/org.apache.nifi.processors.standard.ExtractText/index.html.
> Currently, the processor only allows promoting the entirety of the content to an attribute.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)