
Posted to commits@nifi.apache.org by "Mark Payne (JIRA)" <ji...@apache.org> on 2015/02/02 22:10:34 UTC

[jira] [Commented] (NIFI-293) Add a JDBC Processor for executing arbitrary SQL queries

    [ https://issues.apache.org/jira/browse/NIFI-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301936#comment-14301936 ] 

Mark Payne commented on NIFI-293:
---------------------------------

There is a lot to consider in the design of this processor.

Here is a general outline of how I would implement this. Others, feel free to chime in if this isn't the right design; we'll iterate until it is.

ExecuteSQL processor:
Takes a FlowFile in and executes an arbitrary SQL query.
2 Relationships: success, failure
Properties include:
  Database URL
  Query (supports Expression Language)
  Username
  Password (Sensitive property)
  Connection Timeout
  Serialization Strategy:
      - FlowFile attributes (will route to failure if multiple results returned)
      - XML
      - JSON
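
As a rough illustration of the "FlowFile attributes" strategy above, here is a minimal Java sketch of the routing decision. The class name AttributeStrategy, the method toAttributes, and the representation of result rows as a list of column-name-to-value maps are all assumptions made for this example, not part of NiFi. The outline does not say what should happen when the query returns no rows; this sketch assumes that case succeeds with no attributes added.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper, not a NiFi API: decides the outcome of the "FlowFile attributes" strategy. */
final class AttributeStrategy {

    private AttributeStrategy() {}

    /**
     * Returns the attributes to add to the FlowFile, or Optional.empty() when the
     * query returned more than one row and the FlowFile should be routed to "failure".
     * Each row is modeled as a map of column name to value (an assumption of this sketch).
     */
    static Optional<Map<String, String>> toAttributes(List<Map<String, String>> rows) {
        if (rows.size() > 1) {
            // Multiple results cannot be flattened into one attribute set.
            return Optional.empty();
        }
        if (rows.isEmpty()) {
            // Assumption: no rows means success with nothing to add.
            return Optional.of(Collections.<String, String>emptyMap());
        }
        return Optional.of(rows.get(0));
    }
}
```

A processor built along these lines would route to success when the Optional is present and to failure otherwise.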

Because the SQL could reference attributes that came from some outside entity (added from an HTTP header on receipt, perhaps), and for performance reasons in general, the Query should ideally be compiled as a PreparedStatement. One option is to allow "?" placeholders in the Query, together with user-defined properties where the name of the property is the parameter index to use for the PreparedStatement. For example, the Query could be "SELECT * FROM Orders WHERE OrderId=?" with a user-defined property named "1" whose value is "${order.id}".
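
The index-named property idea can be sketched in Java roughly as follows. IndexedParameters, extract, and bind are hypothetical names chosen for this example, not NiFi APIs. The sketch assumes property values have already had Expression Language evaluated (so "${order.id}" has become a concrete value) and binds every parameter as a string, which a real processor might need to refine per column type.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical helper, not a NiFi API: maps numerically named properties to statement parameters. */
final class IndexedParameters {

    private IndexedParameters() {}

    /**
     * Keeps only the properties whose name is a positive integer (1, 2, 3, ...)
     * and returns them ordered by that index. Other properties, such as
     * "Database URL", are ignored.
     */
    static SortedMap<Integer, String> extract(Map<String, String> properties) {
        SortedMap<Integer, String> params = new TreeMap<>();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            // At most nine digits, so Integer.parseInt cannot overflow.
            if (e.getKey().matches("[1-9][0-9]{0,8}")) {
                params.put(Integer.parseInt(e.getKey()), e.getValue());
            }
        }
        return params;
    }

    /**
     * Binds each index-named property to the matching "?" placeholder.
     * Assumption of this sketch: every value is bound with setString.
     */
    static void bind(PreparedStatement stmt, Map<String, String> properties) throws SQLException {
        for (Map.Entry<Integer, String> p : extract(properties).entrySet()) {
            stmt.setString(p.getKey(), p.getValue());
        }
    }
}
```

With the example above, a property map containing "1" -> "12345" would result in a single call equivalent to stmt.setString(1, "12345") on the statement prepared from "SELECT * FROM Orders WHERE OrderId=?". Because the value is bound rather than concatenated into the SQL text, attribute content from an outside source cannot change the structure of the query.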




> Add a JDBC Processor for executing arbitrary SQL queries
> --------------------------------------------------------
>
>                 Key: NIFI-293
>                 URL: https://issues.apache.org/jira/browse/NIFI-293
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: Ricky Saltzer
>
> This could be very useful for a variety of tasks, such as updating a value in a PostgreSQL table, or adding a new partition to Hive. 
> Ideally, SQL commands could be generated using the NiFi expression language using FlowFile attributes. 
> The processor should be as generic as possible so that any of the popular JDBC drivers can be used (e.g. PostgreSQL, Hive, Impala). 
> I'm still new to how processors are architected, but it seems that using a pre-defined service in the _services.xml_ file (like the distributed map cache) would be the most efficient way to share a connection pool across multiple JDBC processors. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)