Posted to issues@nifi.apache.org by bbende <gi...@git.apache.org> on 2018/03/05 15:50:27 UTC

[GitHub] nifi issue #2478: NIFI-4833 Add scanHBase Processor

Github user bbende commented on the issue:

    https://github.com/apache/nifi/pull/2478
  
    @bdesert Thanks for the updates. I was reviewing the code again and I think we need to change the way the `ScanHBaseResultHandler` works...
    
    Currently it adds rows to a list in memory until bulk size is reached. Since bulk size defaults to 0, in the default case bulk size is never reached and all the rows are left as "hanging" rows. This means that if someone scans a table with 1 million rows, all 1 million rows will be held in memory before being written to the flow file, which would not be good for memory usage.
    
    We should be able to write row by row to the flow file and never add them to a list. Inside the handler we can use `session.append(flowFile, (out) ->` to append a row at a time to the flow file. I think we can then do away with the "hanging rows" concept because there won't be anything buffered in memory.
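
    A rough sketch of the row-at-a-time idea, for illustration only. It is not the PR's code: `StreamingRowSketch`, `StreamingHandler`, and `scanToJson` are hypothetical names, a plain `OutputStream` stands in for the flow file, and the JSON-array framing is an assumption about the output format. In the processor, each write would presumably go through the `session.append` callback mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StreamingRowSketch {

    /** Hypothetical stand-in for the result handler; not the NiFi class. */
    static final class StreamingHandler {
        private final OutputStream out;
        private long rowCount = 0; // only a counter is kept, never the rows

        StreamingHandler(OutputStream out) {
            this.out = out;
        }

        /** Opens the (assumed) JSON array before the first row. */
        void begin() throws IOException {
            out.write('[');
        }

        /** Writes one already-serialized row straight to the stream. */
        void handle(String rowJson) throws IOException {
            if (rowCount > 0) {
                out.write(','); // separator between rows
            }
            out.write(rowJson.getBytes(StandardCharsets.UTF_8));
            rowCount++;
        }

        /** Closes the array once the scan has finished. */
        void end() throws IOException {
            out.write(']');
        }

        long rowCount() {
            return rowCount;
        }
    }

    /** Feeds every row through the handler and returns what was written. */
    static String scanToJson(Iterable<String> rows) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        StreamingHandler handler = new StreamingHandler(sink);
        handler.begin();
        for (String row : rows) {
            handler.handle(row);
        }
        handler.end();
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(scanToJson(Arrays.asList("{\"row\":\"r1\"}", "{\"row\":\"r2\"}")));
    }
}
```

    Because the handler holds only a counter, memory use stays flat regardless of how many rows the scan returns, and there is nothing left over to flush as "hanging" rows at the end.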


---