Posted to issues@nifi.apache.org by mattyb149 <gi...@git.apache.org> on 2018/08/06 17:35:56 UTC

[GitHub] nifi pull request #2938: NIFI-5491: Fixed PutHive3Streaming handling of Byte...

GitHub user mattyb149 opened a pull request:

    https://github.com/apache/nifi/pull/2938

    NIFI-5491: Fixed PutHive3Streaming handling of Byte, Short, and Struct

    Thank you for submitting a contribution to Apache NiFi.
    
    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
         in the commit message?
    
    - [x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    
    - [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    - [x] Is your initial contribution a single, squashed commit?
    
    ### For code changes:
    - [x] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
    - [x] Have you written or updated unit tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
    - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
    - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
    - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?
    
    ### For documentation related changes:
    - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
    
    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mattyb149/nifi NIFI-5491

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/2938.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2938
    
----
commit 68f83ac1e2987e008c6b5ef7d5fcf6e18cc21c0c
Author: Matthew Burgess <ma...@...>
Date:   2018-08-06T17:34:39Z

    NIFI-5491: Fixed PutHive3Streaming handling of Byte, Short, and Struct

----


---

Posted by markap14 <gi...@git.apache.org>.
GitHub user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2938#discussion_r207986292
  
    --- Diff: nifi-nar-bundles/nifi-hive-bundle/nifi-hive3-processors/src/main/java/org/apache/hive/streaming/NiFiRecordSerDe.java ---
    @@ -227,8 +236,32 @@ private Object extractCurrentField(Record record, RecordField field, TypeInfo fi
                     val = DataTypeUtils.convertRecordFieldtoObject(record.getValue(fieldName), field.getDataType());
                     break;
                 case STRUCT:
    -                val = DataTypeUtils.convertRecordFieldtoObject(record.getValue(fieldName), field.getDataType());
    -                break;
    +                // For some reason the Hive StandardStructObjectInspector expects the object corresponding to a "struct" to be an array or List rather than a Map.
    +                // Do the conversion here, calling extractCurrentField recursively to traverse any nested structs.
    +                Record r = (Record) record.getValue(fieldName);
    +                if (r == null) {
    +                    return null;
    +                }
    +                try {
    +                    RecordSchema recordSchema = r.getSchema();
    +                    List<RecordField> recordFields = recordSchema.getFields();
    +                    if (recordFields == null || recordFields.isEmpty()) {
    +                        return new ArrayList<>(0);
    +                    }
    +                    // This List will hold the values of the entries in the Map
    +                    List<Object> structList = new ArrayList<>(recordFields.size());
    +                    StructTypeInfo typeInfo = (StructTypeInfo) schema.getStructFieldTypeInfo(fieldName);
    +                    for (RecordField f : recordFields) {
    +                        String fName = f.getFieldName();
    +                        String normalizedFieldName = fName.toLowerCase();
    +                        structList.add(extractCurrentField(r, f, typeInfo.getStructFieldTypeInfo(normalizedFieldName)));
    +                    }
    +                    return structList;
    +                } catch (Exception e) {
    +                    log.warn("Error [{}] parsing Record [{}].", new Object[]{e.getLocalizedMessage(), r}, e);
    --- End diff --
    
    I would generally recommend using e.toString() instead of e.getLocalizedMessage(), because many exceptions carry no message text (an IllegalArgumentException created without one, for example). In that case the log line comes across as "Error [] parsing Record..." or "Error [null] parsing Record...", which is confusing.
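
To make the difference concrete, here is a minimal sketch that is not taken from the PR. The class and method names are hypothetical, and plain string concatenation stands in for the logger call. It only shows what each accessor yields for an exception that was created without a message.

```java
// Hypothetical demo class; it is not part of NiFiRecordSerDe.
final class ExceptionLogText {

    // Mirrors the pattern in the diff: only the message text is logged.
    static String withLocalizedMessage(Exception e) {
        return "Error [" + e.getLocalizedMessage() + "] parsing Record";
    }

    // Suggested alternative: toString() always includes the exception class name.
    static String withToString(Exception e) {
        return "Error [" + e + "] parsing Record";
    }

    public static void main(String[] args) {
        Exception noMessage = new IllegalArgumentException();
        // prints "Error [null] parsing Record"
        System.out.println(withLocalizedMessage(noMessage));
        // prints "Error [java.lang.IllegalArgumentException] parsing Record"
        System.out.println(withToString(noMessage));
    }
}
```

When a message is present, toString() appends it after the class name, so nothing is lost by switching.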


---

Posted by markap14 <gi...@git.apache.org>.
GitHub user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2938#discussion_r207985525
  
    --- Diff: nifi-nar-bundles/nifi-hive-bundle/nifi-hive3-processors/src/main/java/org/apache/hive/streaming/NiFiRecordSerDe.java ---
    @@ -227,8 +236,32 @@ private Object extractCurrentField(Record record, RecordField field, TypeInfo fi
                     val = DataTypeUtils.convertRecordFieldtoObject(record.getValue(fieldName), field.getDataType());
                     break;
                 case STRUCT:
    -                val = DataTypeUtils.convertRecordFieldtoObject(record.getValue(fieldName), field.getDataType());
    -                break;
    +                // For some reason the Hive StandardStructObjectInspector expects the object corresponding to a "struct" to be an array or List rather than a Map.
    +                // Do the conversion here, calling extractCurrentField recursively to traverse any nested structs.
    +                Record r = (Record) record.getValue(fieldName);
    --- End diff --
    
    We typically like to avoid single-letter variable names :)
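
For illustration, the struct-to-list conversion described in the diff comment can be sketched with descriptive names. This is a simplified stand-in, not the PR's code: it assumes a nested java.util.Map in place of NiFi's Record and RecordSchema, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; a Map models a record here, which is an assumption.
final class StructFlattener {

    // Returns the struct's field values in iteration order, recursing into nested structs.
    static List<Object> toStructList(Map<String, Object> structRecord) {
        if (structRecord == null) {
            return null;
        }
        List<Object> structValues = new ArrayList<>(structRecord.size());
        for (Object fieldValue : structRecord.values()) {
            if (fieldValue instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nestedRecord = (Map<String, Object>) fieldValue;
                structValues.add(toStructList(nestedRecord));
            } else {
                structValues.add(fieldValue);
            }
        }
        return structValues;
    }
}
```

Field order matters for the resulting list, so a caller would want an ordered map such as LinkedHashMap.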


---

Posted by markap14 <gi...@git.apache.org>.
GitHub user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2938#discussion_r207981236
  
    --- Diff: nifi-nar-bundles/nifi-hive-bundle/nifi-hive3-processors/src/main/java/org/apache/hive/streaming/NiFiRecordSerDe.java ---
    @@ -227,8 +236,32 @@ private Object extractCurrentField(Record record, RecordField field, TypeInfo fi
                     val = DataTypeUtils.convertRecordFieldtoObject(record.getValue(fieldName), field.getDataType());
                     break;
                 case STRUCT:
    -                val = DataTypeUtils.convertRecordFieldtoObject(record.getValue(fieldName), field.getDataType());
    -                break;
    +                // For some reason the Hive StandardStructObjectInspector expects the object corresponding to a "struct" to be an array or List rather than a Map.
    +                // Do the conversion here, calling extractCurrentField recursively to traverse any nested structs.
    +                Record r = (Record) record.getValue(fieldName);
    +                if (r == null) {
    +                    return null;
    +                }
    +                try {
    +                    RecordSchema recordSchema = r.getSchema();
    +                    List<RecordField> recordFields = recordSchema.getFields();
    +                    if (recordFields == null || recordFields.isEmpty()) {
    +                        return new ArrayList<>(0);
    --- End diff --
    
    This should just return Collections.emptyList().
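
A minimal sketch of that suggestion, with hypothetical class and method names and a list of field names standing in for the record schema. The point is that Collections.emptyList() avoids allocating a new list, at the cost that the returned list is immutable, so this only fits if callers never add to it (an assumption here).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; it is not part of NiFiRecordSerDe.
final class EmptyStructValues {

    // Returns an immutable empty list when there are no fields, else one entry per field.
    static List<Object> valuesOrEmpty(List<String> fieldNames) {
        if (fieldNames == null || fieldNames.isEmpty()) {
            return Collections.emptyList();
        }
        List<Object> values = new ArrayList<>(fieldNames.size());
        for (String fieldName : fieldNames) {
            // Placeholder value: the normalized field name stands in for the extracted field.
            values.add(fieldName.toLowerCase(Locale.ROOT));
        }
        return values;
    }
}
```

Calling add() on the empty result throws UnsupportedOperationException, which is the main behavioral difference from new ArrayList<>(0).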


---

Posted by markap14 <gi...@git.apache.org>.
GitHub user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2938#discussion_r207985881
  
    --- Diff: nifi-nar-bundles/nifi-hive-bundle/nifi-hive3-processors/src/main/java/org/apache/hive/streaming/NiFiRecordSerDe.java ---
    @@ -205,7 +212,9 @@ private Object extractCurrentField(Record record, RecordField field, TypeInfo fi
                             val = record.getAsString(fieldName);
                             break;
                         case BINARY:
    -                        val = AvroTypeUtil.convertByteArray(record.getAsArray(fieldName)).array();
    +                        Object[] array = record.getAsArray(fieldName);
    +                        if (array == null) return null;
    --- End diff --
    
    I'm surprised this passes checkstyle. I thought we required the {} with the body starting on the next line. That is the preferred syntax anyway, I think.
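
The braced form of that null guard might look like the sketch below. The helper is hypothetical: it assumes the array elements are Number instances and is only a stand-in for the AvroTypeUtil.convertByteArray call in the diff, not a reproduction of it.

```java
// Hypothetical helper illustrating the braced null guard.
final class BinaryFieldGuard {

    // Converts an Object[] of numeric values to a byte[], returning null for a null input.
    static byte[] toBytes(Object[] array) {
        if (array == null) {
            return null;
        }
        byte[] bytes = new byte[array.length];
        for (int i = 0; i < array.length; i++) {
            // Assumption: every element is a Number; anything else throws ClassCastException.
            bytes[i] = ((Number) array[i]).byteValue();
        }
        return bytes;
    }
}
```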


---

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/nifi/pull/2938


---