You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by "Samrat Vilasrao Bandgar (JIRA)" <ji...@apache.org> on 2017/09/06 20:44:00 UTC

[jira] [Updated] (NIFI-4359) Enhance ConvertJSONToSQL processor to handle JSON containing fields having complex type

     [ https://issues.apache.org/jira/browse/NIFI-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Samrat Vilasrao Bandgar updated NIFI-4359:
------------------------------------------
    Description: 
Processor: ConvertJSONToSQL

*Problem statement: *

Sample JSON:
{noformat}
{
    "prop1": "value1",
    "prop2": "value2",
    "prop3": "value3",
    "prop4": {
        "prop5": "value5",
        "prop6": "value6"
    }
}
{noformat}

Sample table:
{noformat}
mysql> desc mytable;
+-------+------------+------+-----+---------+----------------+
| Field | Type       | Null | Key | Default | Extra          |
+-------+------------+------+-----+---------+----------------+
| id    | bigint(20) | NO   | PRI | NULL    | auto_increment |
| prop1 | char(30)   | NO   |     | NULL    |                |
| prop2 | char(30)   | NO   |     | NULL    |                |
| prop3 | char(30)   | NO   |     | NULL    |                |
| prop4 | text       | NO   |     | NULL    |                |
+-------+------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)
{noformat}

With the above mentioned sample json and table, I want to convert the json into insert sql in such a way that prop4 column will get inserted with value {"prop5":"value5","prop6":"value6"}. However, when I use the current ConvertJSONToSQL processor, prop4 column gets inserted with empty string.

*Expected:*

{noformat}
mysql> select * from mytable;
+----+--------+--------+--------+-------------------------------------+
| id | prop1  | prop2  | prop3  | prop4                               |
+----+--------+--------+--------+-------------------------------------+
|  1 | value1 | value2 | value3 | {"prop5":"value5","prop6":"value6"} |
+----+--------+--------+--------+-------------------------------------+
1 row in set (0.00 sec)
{noformat}

*Actual:*

{noformat}
mysql> select * from mytable;
+----+--------+--------+--------+----------+
| id | prop1  | prop2  | prop3  | prop4    |
+----+--------+--------+--------+----------+
|  1 | value1 | value2 | value3 |          |
+----+--------+--------+--------+----------+
1 row in set (0.00 sec)
{noformat}

*Attributes details captured from Provenance Event UI for the above use case are:*

sql.args.1.type
1
sql.args.1.value
value1

sql.args.2.type
1
sql.args.2.value
value2

sql.args.3.type
1
sql.args.3.value
value3

sql.args.4.type
-1
sql.args.4.value
{color:red}Empty string set{color}

sql.table
mytable

The ConvertJSONToSQL.java has a method createSqlStringValue(final JsonNode fieldNode, final Integer colSize, final int sqlType) which is responsible for populating attribute values for each column. This method uses below line to get field value.

{code:java}
String fieldValue = fieldNode.asText();
{code}

Documentation for org.codehaus.jackson.JsonNode.asText() method tells us:

*asText()*

Method that will return valid String representation of the container value, if the node is a value node (method isValueNode() returns true), otherwise empty String.

Since prop4 in this case is not a value node, empty string is returned and get set to attribute value for the column prop4.

Suggested improvement is as below. 

If the fieldNode is value node, use asText() else use toString() with StringEscapeUtils.escapeSql() to take of characters like quotes in insert query. I have tested this locally. Please let me know if it makes sense to add this improvement. I will attach the patch file for code changes.

Thanks

  was:
Processor: ConvertJSONToSQL

*Problem statement: *

Sample JSON:
{
    "prop1": "value1",
    "prop2": "value2",
    "prop3": "value3",
    "prop4": {
        "prop5": "value5",
        "prop6": "value6"
    }
}

Sample table:
{noformat}
mysql> desc mytable;
+-------+------------+------+-----+---------+----------------+
| Field | Type       | Null | Key | Default | Extra          |
+-------+------------+------+-----+---------+----------------+
| id    | bigint(20) | NO   | PRI | NULL    | auto_increment |
| prop1 | char(30)   | NO   |     | NULL    |                |
| prop2 | char(30)   | NO   |     | NULL    |                |
| prop3 | char(30)   | NO   |     | NULL    |                |
| prop4 | text       | NO   |     | NULL    |                |
+-------+------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)
{noformat}

With the above mentioned sample json and table, I want to convert the json into insert sql in such a way that prop4 column will get inserted with value {"prop5":"value5","prop6":"value6"}. However, when I use the current ConvertJSONToSQL processor, prop4 column gets inserted with empty string.

*Expected:*

{noformat}
mysql> select * from mytable;
+----+--------+--------+--------+-------------------------------------+
| id | prop1  | prop2  | prop3  | prop4                               |
+----+--------+--------+--------+-------------------------------------+
|  1 | value1 | value2 | value3 | {"prop5":"value5","prop6":"value6"} |
+----+--------+--------+--------+-------------------------------------+
1 row in set (0.00 sec)
{noformat}

*Actual:*

{noformat}
mysql> select * from mytable;
+----+--------+--------+--------+----------+
| id | prop1  | prop2  | prop3  | prop4    |
+----+--------+--------+--------+----------+
|  1 | value1 | value2 | value3 |          |
+----+--------+--------+--------+----------+
1 row in set (0.00 sec)
{noformat}

*Attributes details captured from Provenance Event UI for the above use case are:*

sql.args.1.type
1
sql.args.1.value
value1

sql.args.2.type
1
sql.args.2.value
value2

sql.args.3.type
1
sql.args.3.value
value3

sql.args.4.type
-1
sql.args.4.value
{color:red}Empty string set{color}

sql.table
mytable

The ConvertJSONToSQL.java has a method createSqlStringValue(final JsonNode fieldNode, final Integer colSize, final int sqlType) which is responsible for populating attribute values for each column. This method uses below line to get field value.

{code:java}
String fieldValue = fieldNode.asText();
{code}

Documentation for org.codehaus.jackson.JsonNode.asText() method tells us:

*asText()*

Method that will return valid String representation of the container value, if the node is a value node (method isValueNode() returns true), otherwise empty String.

Since prop4 in this case is not a value node, empty string is returned and get set to attribute value for the column prop4.

Suggested improvement is as below. 

If the fieldNode is value node, use asText() else use toString() with StringEscapeUtils.escapeSql() to take of characters like quotes in insert query. I have tested this locally. Please let me know if it makes sense to add this improvement. I will attach the patch file for code changes.

Thanks


> Enhance ConvertJSONToSQL processor to handle JSON containing fields having complex type
> ---------------------------------------------------------------------------------------
>
>                 Key: NIFI-4359
>                 URL: https://issues.apache.org/jira/browse/NIFI-4359
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Samrat Vilasrao Bandgar
>
> Processor: ConvertJSONToSQL
> *Problem statement: *
> Sample JSON:
> {noformat}
> {
>     "prop1": "value1",
>     "prop2": "value2",
>     "prop3": "value3",
>     "prop4": {
>         "prop5": "value5",
>         "prop6": "value6"
>     }
> }
> {noformat}
> Sample table:
> {noformat}
> mysql> desc mytable;
> +-------+------------+------+-----+---------+----------------+
> | Field | Type       | Null | Key | Default | Extra          |
> +-------+------------+------+-----+---------+----------------+
> | id    | bigint(20) | NO   | PRI | NULL    | auto_increment |
> | prop1 | char(30)   | NO   |     | NULL    |                |
> | prop2 | char(30)   | NO   |     | NULL    |                |
> | prop3 | char(30)   | NO   |     | NULL    |                |
> | prop4 | text       | NO   |     | NULL    |                |
> +-------+------------+------+-----+---------+----------------+
> 5 rows in set (0.00 sec)
> {noformat}
> With the above mentioned sample json and table, I want to convert the json into insert sql in such a way that prop4 column will get inserted with value {"prop5":"value5","prop6":"value6"}. However, when I use the current ConvertJSONToSQL processor, prop4 column gets inserted with empty string.
> *Expected:*
> {noformat}
> mysql> select * from mytable;
> +----+--------+--------+--------+-------------------------------------+
> | id | prop1  | prop2  | prop3  | prop4                               |
> +----+--------+--------+--------+-------------------------------------+
> |  1 | value1 | value2 | value3 | {"prop5":"value5","prop6":"value6"} |
> +----+--------+--------+--------+-------------------------------------+
> 1 row in set (0.00 sec)
> {noformat}
> *Actual:*
> {noformat}
> mysql> select * from mytable;
> +----+--------+--------+--------+----------+
> | id | prop1  | prop2  | prop3  | prop4    |
> +----+--------+--------+--------+----------+
> |  1 | value1 | value2 | value3 |          |
> +----+--------+--------+--------+----------+
> 1 row in set (0.00 sec)
> {noformat}
> *Attributes details captured from Provenance Event UI for the above use case are:*
> sql.args.1.type
> 1
> sql.args.1.value
> value1
> sql.args.2.type
> 1
> sql.args.2.value
> value2
> sql.args.3.type
> 1
> sql.args.3.value
> value3
> sql.args.4.type
> -1
> sql.args.4.value
> {color:red}Empty string set{color}
> sql.table
> mytable
> The ConvertJSONToSQL.java has a method createSqlStringValue(final JsonNode fieldNode, final Integer colSize, final int sqlType) which is responsible for populating attribute values for each column. This method uses below line to get field value.
> {code:java}
> String fieldValue = fieldNode.asText();
> {code}
> Documentation for org.codehaus.jackson.JsonNode.asText() method tells us:
> *asText()*
> Method that will return valid String representation of the container value, if the node is a value node (method isValueNode() returns true), otherwise empty String.
> Since prop4 in this case is not a value node, empty string is returned and get set to attribute value for the column prop4.
> Suggested improvement is as below. 
> If the fieldNode is value node, use asText() else use toString() with StringEscapeUtils.escapeSql() to take of characters like quotes in insert query. I have tested this locally. Please let me know if it makes sense to add this improvement. I will attach the patch file for code changes.
> Thanks



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)