Posted to issues@nifi.apache.org by "Shawn Weeks (Jira)" <ji...@apache.org> on 2020/01/07 20:50:00 UTC

[jira] [Commented] (NIFI-6986) ValidateRecord should optionally validate if nullable fields are present

    [ https://issues.apache.org/jira/browse/NIFI-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010083#comment-17010083 ] 

Shawn Weeks commented on NIFI-6986:
-----------------------------------

Thanks for looking into this. I think I can implement your suggestion pretty easily and it's simpler than what I came up with.

> ValidateRecord should optionally validate if nullable fields are present
> ------------------------------------------------------------------------
>
>                 Key: NIFI-6986
>                 URL: https://issues.apache.org/jira/browse/NIFI-6986
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Priority: Major
>
> Currently, if a field is nullable according to the schema, ValidateRecord considers the record to be valid, even if the field is missing completely. For some use cases, this is desirable. For example, it is common to drop fields in JSON when the field's value is null, because it can drastically reduce the size of the JSON.
> However, in other use cases, this is not desirable. For example, in a CSV file, we may want to require that each Record contain the appropriate number of fields. It may be acceptable, for instance, to have a line like "1234, John Smith, , , ," but not to have a line like "1234, John Smith".
> ValidateRecord should be updated with a new Property: "Allow Missing Null Values". If the value is `true` (the default, to avoid changing behavior between versions), the Processor should behave as it does now, where the absence of the field is synonymous with a null value. In this case, a line like "1234, John Smith" would be valid when the CSV is expecting 6 fields, as long as the last 4 fields are nullable.
> But if the value of this new property is `false`, the Processor should require that all fields be present in the data, even if the field has a null value. In this case, a line like "1234, John Smith" would be invalid if the CSV were expected to contain 6 fields.
> The `WriteJsonResult` class has a method in it: `private boolean isFieldPresent(RecordField field, Record record)`. This method should really exist on `Record` itself with a slightly different signature: `boolean isFieldPresent(RecordField field)`. It should have a default implementation, akin to the one in `WriteJsonResult`, and `WriteJsonResult` should then simply use that method.
> `StandardSchemaValidator` should then be updated to use this method to validate that records have all required fields, as configured. `SchemaValidationContext` should also be updated to indicate whether or not the presence of null values should be validated.
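
For illustration only, here is a rough sketch of how the pieces described above might fit together. It does not use the real NiFi classes: `Field`, `SimpleRecord`, `recordOf` and `validate` are hypothetical, simplified stand-ins for `RecordField`, `Record` and the validator, and the `allowMissingNullValues` flag is an assumed mapping of the proposed "Allow Missing Null Values" property. The point is just the shape of the idea: a default `isFieldPresent` on the record type, and a validator that treats "missing" and "present but null" differently when the flag is `false`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NullableFieldCheck {

    // Hypothetical stand-in for a schema field: a name plus a nullable flag.
    static final class Field {
        final String name;
        final boolean nullable;

        Field(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }
    }

    // Hypothetical stand-in for a record; only what this sketch needs.
    interface SimpleRecord {
        Set<String> rawFieldNames();

        Object getValue(String fieldName);

        // Default implementation: a field is present if the raw data named it,
        // regardless of whether its value is null.
        default boolean isFieldPresent(Field field) {
            return rawFieldNames().contains(field.name);
        }
    }

    // Builds a record backed by a map; a key mapped to null is "present but null".
    static SimpleRecord recordOf(Map<String, Object> values) {
        return new SimpleRecord() {
            @Override
            public Set<String> rawFieldNames() {
                return values.keySet();
            }

            @Override
            public Object getValue(String fieldName) {
                return values.get(fieldName);
            }
        };
    }

    // Returns a list of problems; an empty list means the record is valid.
    static List<String> validate(List<Field> schema, SimpleRecord record,
                                 boolean allowMissingNullValues) {
        List<String> problems = new ArrayList<>();
        for (Field field : schema) {
            if (!record.isFieldPresent(field)) {
                if (!field.nullable) {
                    problems.add(field.name + " is required but missing");
                } else if (!allowMissingNullValues) {
                    problems.add(field.name + " is nullable but must be present");
                }
                continue;
            }
            if (record.getValue(field.name) == null && !field.nullable) {
                problems.add(field.name + " must not be null");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Field> schema = Arrays.asList(
                new Field("id", false),
                new Field("name", false),
                new Field("email", true));

        // Like the CSV line "1234, John Smith": the nullable field is absent.
        Map<String, Object> shortLine = new HashMap<>();
        shortLine.put("id", "1234");
        shortLine.put("name", "John Smith");

        // Like "1234, John Smith, ": the nullable field is present but null.
        Map<String, Object> fullLine = new HashMap<>(shortLine);
        fullLine.put("email", null);

        System.out.println(validate(schema, recordOf(shortLine), true));
        System.out.println(validate(schema, recordOf(shortLine), false));
        System.out.println(validate(schema, recordOf(fullLine), false));
    }
}
```

With the flag set to `true`, the short line passes as it does today. With it set to `false`, only the line that actually carries the nullable field passes.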



--
This message was sent by Atlassian Jira
(v8.3.4#803005)