You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2020/11/05 20:11:41 UTC

[GitHub] [nifi] pvillard31 opened a new pull request #4648: NIFI-7981 - add support for enum type in avro schema

pvillard31 opened a new pull request #4648:
URL: https://github.com/apache/nifi/pull/4648


   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
   #### Description of PR
   
   _Enables X functionality; fixes bug NIFI-YYYY._
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
        in the commit message?
   
   - [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically `main`)?
   
   - [ ] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
   - [ ] Have you written or updated unit tests to verify your changes?
   - [ ] Have you verified that the full build is successful on JDK 8?
   - [ ] Have you verified that the full build is successful on JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to .name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] MikeThomsen commented on a change in pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
MikeThomsen commented on a change in pull request #4648:
URL: https://github.com/apache/nifi/pull/4648#discussion_r518366732



##########
File path: nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/type/EnumDataType.java
##########
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.serialization.record.type;
+
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.RecordFieldType;
+
+import java.util.List;
+import java.util.Objects;
+
+public class EnumDataType extends DataType {
+
+    private final List<String> enums;

Review comment:
       Would it make more sense to call this `symbols`, which is the terminology that Avro uses?

##########
File path: nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/util/DataTypeUtils.java
##########
@@ -1025,6 +1029,17 @@ public static boolean isStringTypeCompatible(final Object value) {
         return value != null;
     }
 
+    public static boolean isEnumTypeCompatible(final Object value, final EnumDataType enumType) {
+        return enumType.getEnums() != null && enumType.getEnums().contains(value);
+    }
+
+    private static Object toEnum(Object value, EnumDataType dataType, String fieldName) {
+        if(dataType.getEnums() != null && dataType.getEnums().contains(value)) {

Review comment:
       Are you assuming only conversion from strings here? Like not even allowing people to convert booleans to TRUE and FALSE?

##########
File path: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/JsonTreeRowRecordReader.java
##########
@@ -84,22 +84,6 @@ private JsonNode getChildNode(final JsonNode jsonNode, final RecordField field)
         return null;
     }
 
-    private boolean isDateTimeTimestampType(final RecordField field) {

Review comment:
       Was this just unused code?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] markap14 commented on pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
markap14 commented on pull request #4648:
URL: https://github.com/apache/nifi/pull/4648#issuecomment-724105329


   So this isn't really how i was envisioning  the "enum" concept being implemented. But after thinking about it for a bit, I do think it's quite reasonable to have an Enum data type. Even if we introduce a notion of adding restrictions. So I think I'm ok with it. Thanks @pvillard31 and @MikeThomsen !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] MikeThomsen commented on pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
MikeThomsen commented on pull request #4648:
URL: https://github.com/apache/nifi/pull/4648#issuecomment-722700884


   Ran it with a live flow, and it worked as expected with my enum test schema (with both good and bad json inputs).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] MikeThomsen commented on a change in pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
MikeThomsen commented on a change in pull request #4648:
URL: https://github.com/apache/nifi/pull/4648#discussion_r518407403



##########
File path: nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/util/DataTypeUtils.java
##########
@@ -1025,6 +1029,17 @@ public static boolean isStringTypeCompatible(final Object value) {
         return value != null;
     }
 
+    public static boolean isEnumTypeCompatible(final Object value, final EnumDataType enumType) {
+        return enumType.getEnums() != null && enumType.getEnums().contains(value);
+    }
+
+    private static Object toEnum(Object value, EnumDataType dataType, String fieldName) {
+        if(dataType.getEnums() != null && dataType.getEnums().contains(value)) {

Review comment:
       Yeah, I know the spec only allows strings, but I was wondering if wanted to allow boolean values to be converted to strings here. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] markap14 commented on pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
markap14 commented on pull request #4648:
URL: https://github.com/apache/nifi/pull/4648#issuecomment-722710471


   I'm personally not a fan of introducing Enum here as a datatype. I avoided introducing that as one of the data types when originally creating the Record API. What I think makes far more sense is a more general notion of restrictions of types. Consider, for example, how XSD allows you to add a restriction to a base type. We want to eventually support specifying schemas using XSD, and we'll need to support this notion in order to make that happen. I would certainly favor introducing that notion now over introducing an Enum. Then, when converting an Avro schema into a RecordSchema, we could do so as a restriction on the String type. This provides far more power and flexibility when using schemas, such as specifying that an int must be in the range of 1-100, and this then becomes extremely powerful when coupled with something like ValidateRecord.
   Adding an enum at this time, IMO, makes this a bit awkward, because it is then a specific type whose purpose is to introduce a restriction on another type (I.e., it is basically a Restriction on a String type). So I think it kind of makes things confusing if our API provides two mechanisms for specifying an enumeration: using an enum type, or specifying a restriction on a String type...


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] markap14 commented on a change in pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
markap14 commented on a change in pull request #4648:
URL: https://github.com/apache/nifi/pull/4648#discussion_r518437335



##########
File path: nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/util/DataTypeUtils.java
##########
@@ -1025,6 +1029,17 @@ public static boolean isStringTypeCompatible(final Object value) {
         return value != null;
     }
 
+    public static boolean isEnumTypeCompatible(final Object value, final EnumDataType enumType) {
+        return enumType.getEnums() != null && enumType.getEnums().contains(value);
+    }
+
+    private static Object toEnum(Object value, EnumDataType dataType, String fieldName) {
+        if(dataType.getEnums() != null && dataType.getEnums().contains(value)) {

Review comment:
       We should be very careful here - there is no spec. We absolutely do not - and should not - follow the Avro spec. We should do what makes sense for us. Personally I am not a fan of introducing Enums at all. I very intentionally didn't add those in as a data type when we created the initial set.  More on that in a general comment below :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] asfgit closed pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #4648:
URL: https://github.com/apache/nifi/pull/4648


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] MikeThomsen commented on pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
MikeThomsen commented on pull request #4648:
URL: https://github.com/apache/nifi/pull/4648#issuecomment-722721023


   @markap14 has any work, even just planning, started on restricted types? I agree with your points, and if there are plans to proceed on such a feature set it would make sense to back this type out and plan to massage enum functionality into that.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] pvillard31 commented on a change in pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
pvillard31 commented on a change in pull request #4648:
URL: https://github.com/apache/nifi/pull/4648#discussion_r518405814



##########
File path: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/JsonTreeRowRecordReader.java
##########
@@ -84,22 +84,6 @@ private JsonNode getChildNode(final JsonNode jsonNode, final RecordField field)
         return null;
     }
 
-    private boolean isDateTimeTimestampType(final RecordField field) {

Review comment:
       yes




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] pvillard31 commented on a change in pull request #4648: NIFI-7981 - add support for enum type in avro schema

Posted by GitBox <gi...@apache.org>.
pvillard31 commented on a change in pull request #4648:
URL: https://github.com/apache/nifi/pull/4648#discussion_r518405917



##########
File path: nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/util/DataTypeUtils.java
##########
@@ -1025,6 +1029,17 @@ public static boolean isStringTypeCompatible(final Object value) {
         return value != null;
     }
 
+    public static boolean isEnumTypeCompatible(final Object value, final EnumDataType enumType) {
+        return enumType.getEnums() != null && enumType.getEnums().contains(value);
+    }
+
+    private static Object toEnum(Object value, EnumDataType dataType, String fieldName) {
+        if(dataType.getEnums() != null && dataType.getEnums().contains(value)) {

Review comment:
       I assumed that we can only have strings: http://avro.apache.org/docs/current/spec.html#Enums




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org