Posted to issues@nifi.apache.org by WilliamNouet <gi...@git.apache.org> on 2017/03/08 20:55:39 UTC

[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

GitHub user WilliamNouet opened a pull request:

    https://github.com/apache/nifi/pull/1576

    NIFI-3518 Create a Morphlines processor

    Thank you for submitting a contribution to Apache NiFi.
    
    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:
    
    For all changes:
    
    [Y] Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?
    
    [Y] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    
    [Y] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    [Y] Is your initial contribution a single, squashed commit?
    
    For code changes:
    
    [Y] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
    
    [Y] Have you written or updated unit tests to verify your changes?
    
    [N/A] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
    
    [N/A] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
    
    [N/A] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
    
    [N/A] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?
    
    For documentation related changes:
    
    [N/A] Have you ensured that format looks appropriate for the output in which it is rendered?
    
    Note:
    
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/WilliamNouet/nifi NIFI-3518-8

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/1576.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1576
    
----
commit 80cbc2a2fcd22da9fcb23bfb40c2183df6b5bbfb
Author: WilliamNouet <wi...@berkeley.edu>
Date:   2017-03-08T19:10:36Z

    Add Morphlines processor

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110553085
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/src/main/java/org/apache/nifi/processors/morphlines/ImplementMorphlines.java ---
    @@ -0,0 +1,219 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.nifi.processors.morphlines;
    +
    +import com.google.common.base.Preconditions;
    +import org.apache.nifi.components.PropertyDescriptor;
    +import org.apache.nifi.components.PropertyValue;
    +import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.*;
    +import org.apache.nifi.annotation.documentation.CapabilityDescription;
    +import org.apache.nifi.annotation.documentation.Tags;
    +import org.apache.nifi.annotation.lifecycle.OnScheduled;
    +import org.apache.nifi.processor.exception.ProcessException;
    +import org.apache.nifi.processor.io.InputStreamCallback;
    +import org.apache.nifi.processor.io.OutputStreamCallback;
    +import org.apache.nifi.processor.io.StreamCallback;
    +import org.apache.nifi.processor.util.StandardValidators;
    +import org.apache.nifi.stream.io.StreamUtils;
    +import org.kitesdk.morphline.api.Command;
    +import org.kitesdk.morphline.api.MorphlineContext;
    +import org.kitesdk.morphline.api.Record;
    +import org.kitesdk.morphline.base.Fields;
    +
    +import org.kitesdk.morphline.base.Notifications;
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.ImmutableSet;
    +import org.apache.nifi.annotation.lifecycle.OnScheduled;
    +
    +import java.io.File;
    +import java.io.IOException;
    +import java.io.InputStream;
    +import java.io.OutputStream;
    +import java.util.*;
    +import java.util.stream.*;
    +import java.util.concurrent.atomic.*;
    +import org.apache.nifi.annotation.lifecycle.OnStopped;
    +import org.apache.nifi.processor.exception.*;
    +
    +@Tags({"kitesdk", "morphlines", "ETL", "HDFS", "avro", "Solr", "HBase"})
    +@CapabilityDescription("Implements Morphlines (http://kitesdk.org/docs/1.1.0/morphlines/) framework, which performs in-memory container of transformation commands in oder to perform tasks such as loading, parsing, transforming, or otherwise processing a single record.")
    +public class MorphlinesProcessor extends AbstractProcessor {
    --- End diff --
    
    Please change the file name or classname in order to match. 



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110553363
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/pom.xml ---
    @@ -0,0 +1,110 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<!--
    +  Licensed to the Apache Software Foundation (ASF) under one or more
    +  contributor license agreements. See the NOTICE file distributed with
    +  this work for additional information regarding copyright ownership.
    +  The ASF licenses this file to You under the Apache License, Version 2.0
    +  (the "License"); you may not use this file except in compliance with
    +  the License. You may obtain a copy of the License at
    +  http://www.apache.org/licenses/LICENSE-2.0
    +  Unless required by applicable law or agreed to in writing, software
    +  distributed under the License is distributed on an "AS IS" BASIS,
    +  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +  See the License for the specific language governing permissions and
    +  limitations under the License.
    +-->
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +  
    +    <parent>
    +        <groupId>org.apache.nifi</groupId>
    +        <artifactId>nifi-morphlines-bundle</artifactId>
    +        <version>1.2.0-SNAPSHOT</version>
    +    </parent>
    +
    +    <artifactId>nifi-morphlines-processors</artifactId>
    +    <packaging>jar</packaging>
    +
    +    <dependencies>
    +        <dependency>
    +            <groupId>org.apache.nifi</groupId>
    +            <artifactId>nifi-api</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.apache.nifi</groupId>
    +            <artifactId>nifi-processor-utils</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.apache.nifi</groupId>
    +            <artifactId>nifi-mock</artifactId>
    +            <scope>test</scope>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.slf4j</groupId>
    +            <artifactId>slf4j-simple</artifactId>
    +            <scope>test</scope>
    +        </dependency>
    +        <dependency>
    +            <groupId>junit</groupId>
    +            <artifactId>junit</artifactId>
    +            <version>4.11</version>
    +            <scope>test</scope>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-core</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-avro</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-json</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-saxon</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-hadoop-core</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-hadoop-parquet-avro</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-hadoop-sequencefile</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-hadoop-rcfile</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-tika-core</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-tika-decompress</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-twitter</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-maxmind</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-metrics-servlets</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-useragent</artifactId>
    --- End diff --
    
    I'm a bit worried that importing all of these dependencies is too broad in scope. From my understanding, not all of them are inherently needed; instead they represent a toolbox of things that could potentially be used. 
    
    I'd much prefer we limit this to a core group of functionality that we can't already do in NiFi instead.



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by WilliamNouet <gi...@git.apache.org>.
Github user WilliamNouet closed the pull request at:

    https://github.com/apache/nifi/pull/1576



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110552680
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/pom.xml ---
    @@ -0,0 +1,35 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<!--
    +  Licensed to the Apache Software Foundation (ASF) under one or more
    +  contributor license agreements. See the NOTICE file distributed with
    +  this work for additional information regarding copyright ownership.
    +  The ASF licenses this file to You under the Apache License, Version 2.0
    +  (the "License"); you may not use this file except in compliance with
    +  the License. You may obtain a copy of the License at
    +  http://www.apache.org/licenses/LICENSE-2.0
    +  Unless required by applicable law or agreed to in writing, software
    +  distributed under the License is distributed on an "AS IS" BASIS,
    +  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +  See the License for the specific language governing permissions and
    +  limitations under the License.
    +-->
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +
    +    <parent>
    +        <groupId>org.apache.nifi</groupId>
    +        <artifactId>nifi-nar-bundles</artifactId>
    +        <version>1.1.1</version>
    --- End diff --
    
    You didn't address this comment in the previous PR[1]. This version should be 1.2.0-SNAPSHOT to match the others.
    
    [1] https://github.com/apache/nifi/pull/1529#discussion_r102965258



[GitHub] nifi issue #1576: NIFI-3518 Create a Morphlines processor

Posted by WilliamNouet <gi...@git.apache.org>.
Github user WilliamNouet commented on the issue:

    https://github.com/apache/nifi/pull/1576
  
    @JPercivall @trixpan Any update on this PR?



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110552730
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/pom.xml ---
    @@ -0,0 +1,110 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<!--
    +  Licensed to the Apache Software Foundation (ASF) under one or more
    +  contributor license agreements. See the NOTICE file distributed with
    +  this work for additional information regarding copyright ownership.
    +  The ASF licenses this file to You under the Apache License, Version 2.0
    +  (the "License"); you may not use this file except in compliance with
    +  the License. You may obtain a copy of the License at
    +  http://www.apache.org/licenses/LICENSE-2.0
    +  Unless required by applicable law or agreed to in writing, software
    +  distributed under the License is distributed on an "AS IS" BASIS,
    +  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +  See the License for the specific language governing permissions and
    +  limitations under the License.
    +-->
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +  
    +    <parent>
    +        <groupId>org.apache.nifi</groupId>
    +        <artifactId>nifi-morphlines-bundle</artifactId>
    +        <version>1.2.0-SNAPSHOT</version>
    +    </parent>
    +
    +    <artifactId>nifi-morphlines-processors</artifactId>
    +    <packaging>jar</packaging>
    +
    +    <dependencies>
    +        <dependency>
    +            <groupId>org.apache.nifi</groupId>
    +            <artifactId>nifi-api</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.apache.nifi</groupId>
    +            <artifactId>nifi-processor-utils</artifactId>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.apache.nifi</groupId>
    +            <artifactId>nifi-mock</artifactId>
    +            <scope>test</scope>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.slf4j</groupId>
    +            <artifactId>slf4j-simple</artifactId>
    +            <scope>test</scope>
    +        </dependency>
    +        <dependency>
    +            <groupId>junit</groupId>
    +            <artifactId>junit</artifactId>
    +            <version>4.11</version>
    +            <scope>test</scope>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.kitesdk</groupId>
    +            <artifactId>kite-morphlines-core</artifactId>
    --- End diff --
    
    All of these dependencies are missing versions. I can't build it at all.



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110553180
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/src/main/java/org/apache/nifi/processors/morphlines/ImplementMorphlines.java ---
    @@ -0,0 +1,219 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.nifi.processors.morphlines;
    +
    +import com.google.common.base.Preconditions;
    +import org.apache.nifi.components.PropertyDescriptor;
    +import org.apache.nifi.components.PropertyValue;
    +import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.*;
    +import org.apache.nifi.annotation.documentation.CapabilityDescription;
    +import org.apache.nifi.annotation.documentation.Tags;
    +import org.apache.nifi.annotation.lifecycle.OnScheduled;
    +import org.apache.nifi.processor.exception.ProcessException;
    +import org.apache.nifi.processor.io.InputStreamCallback;
    +import org.apache.nifi.processor.io.OutputStreamCallback;
    +import org.apache.nifi.processor.io.StreamCallback;
    +import org.apache.nifi.processor.util.StandardValidators;
    +import org.apache.nifi.stream.io.StreamUtils;
    +import org.kitesdk.morphline.api.Command;
    +import org.kitesdk.morphline.api.MorphlineContext;
    +import org.kitesdk.morphline.api.Record;
    +import org.kitesdk.morphline.base.Fields;
    +
    +import org.kitesdk.morphline.base.Notifications;
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.ImmutableSet;
    +import org.apache.nifi.annotation.lifecycle.OnScheduled;
    +
    +import java.io.File;
    +import java.io.IOException;
    +import java.io.InputStream;
    +import java.io.OutputStream;
    +import java.util.*;
    +import java.util.stream.*;
    +import java.util.concurrent.atomic.*;
    +import org.apache.nifi.annotation.lifecycle.OnStopped;
    +import org.apache.nifi.processor.exception.*;
    +
    +@Tags({"kitesdk", "morphlines", "ETL", "HDFS", "avro", "Solr", "HBase"})
    +@CapabilityDescription("Implements Morphlines (http://kitesdk.org/docs/1.1.0/morphlines/) framework, which performs in-memory container of transformation commands in oder to perform tasks such as loading, parsing, transforming, or otherwise processing a single record.")
    +public class MorphlinesProcessor extends AbstractProcessor {
    +
    +    private Command morphline;
    +    private volatile Record record = new Record();
    +    private volatile Collector collector = new Collector();
    +
    +    public static final PropertyDescriptor MORPHLINES_ID = new PropertyDescriptor
    +            .Builder().name("Morphlines ID")
    +            .description("Identifier of the morphlines context")
    +            .required(true)
    +            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
    +            .expressionLanguageSupported(true)
    +            .build();
    +
    +    public static final PropertyDescriptor MORPHLINES_FILE = new PropertyDescriptor
    +            .Builder().name("Morphlines File")
    +            .description("File for the morphlines context")
    +            .required(true)
    +            .addValidator(StandardValidators.FILE_EXISTS_VALIDATOR)
    +            .expressionLanguageSupported(true)
    +            .build();
    +
    +    public static final PropertyDescriptor MORPHLINES_OUTPUT_FIELD = new PropertyDescriptor
    +            .Builder().name("Morphlines output field")
    +            .description("Field name of output in Morphlines. Default is '_attachment_body'.")
    +            .required(false)
    +	    .expressionLanguageSupported(true)
    +            .defaultValue("_attachment_body")
    +            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
    +            .build();
    +
    +    public static final Relationship REL_SUCCESS = new Relationship.Builder()
    +            .name("success")
    +            .description("Relationship for success.")
    +            .build();
    +
    +    public static final Relationship REL_FAILURE = new Relationship.Builder()
    +            .name("failure")
    +            .description("Relationship for failure of morphlines.")
    +            .build();
    +
    +    private static final List<PropertyDescriptor> PROPERTIES = ImmutableList.<PropertyDescriptor>builder()
    +            .add(MORPHLINES_FILE)
    +            .add(MORPHLINES_ID)
    +            .add(MORPHLINES_OUTPUT_FIELD)
    +            .build();
    +
    +    private static final Set<Relationship> RELATIONSHIPS = ImmutableSet.<Relationship>builder()
    +            .add(REL_SUCCESS)
    +            .add(REL_FAILURE)
    +            .build();
    +
    +
    +    private File morphLinesFile;
    +    private String morphLinesId;
    +    private String morphlinesOutputField;
    +    private PropertyValue morphLinesFileProperty;
    +    private PropertyValue morphLinesIdProperty;
    +    private PropertyValue morphLinesOutputField;
    +    private MorphlineContext morphlineContext;
    +
    +    @Override
    +    public Set<Relationship> getRelationships() {
    +        return RELATIONSHIPS;
    +    }
    +
    +    @Override
    +    public final List<PropertyDescriptor> getSupportedPropertyDescriptors() {
    +        return PROPERTIES;
    +    }
    +
    +    @OnScheduled
    +    public void onScheduled(ProcessContext context) throws Exception {
    +	morphlinesFileProperty = context.getProperty(MORPHLINES_FILE);
    +	morphlinesIdProperty = context.getProperty(MORPHLINES_ID);
    +	morphlinesOutputFieldProperty = context.getProperty(MORPHLINES_OUTPUT_FIELD);
    +        morphlineContext = new MorphlineContext.Builder().build();
    +    }
    +
    +    @Override
    +    public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
    +        FlowFile flowFile = session.get();
    +        if ( flowFile == null ) {
    +            return;
    +        }
    +
    +	morphLinesFile = new File(morphlinesFileProperty.evaluateAttributeExpressions(flowfile).getValue());
    +	morphLinesId = morphlinesIdProperty.evaluateAttributeExpressions(flowfile).getValue();
    +	morphlinesOutputField = morphlinesOutputFieldProperty.evaluateAttributeExpressions(flowfile).getValue();
    +
    +        final AtomicLong written = new AtomicLong(0L);
    +        final byte[] value = new byte[(int) flowFile.getSize()];
    +
    +        try{
    +            flowFile = session.write(flowFile, new StreamCallback() {
    +                @Override
    +                public void process(InputStream in, OutputStream out) throws IOException {
    +                    StreamUtils.fillBuffer(in, value);
    +                    Record record = new Record();
    +                    record.put(Fields.ATTACHMENT_BODY, value);
    +                    Collector collectorRecord = new Collector();
    +                    morphline = new org.kitesdk.morphline.base.Compiler().compile(morphLinesFile, morphLinesId, morphlineContext, collectorRecord);
    --- End diff --
    
    You said before that you couldn't move this out of the onTrigger due to needing the collectorRecord to create and compile it.
    
    The "Example Driver Program" in the Cloudera intro documentation [1] compiles the morphline once and then processes each record. Can't we do the same here? Relevant section below:
    
    ```
      MorphlineContext context = new MorphlineContext.Builder().build();
      Command morphline = new Compiler().compile(configFile, null, context, null);
    
      // process each input data file
      Notifications.notifyBeginTransaction(morphline);
      for (int i = 1; i < args.length; i++) {
        InputStream in = new FileInputStream(new File(args[i]));
        Record record = new Record();
        record.put(Fields.ATTACHMENT_BODY, in);
        morphline.process(record);
        in.close();
      }
    ```
    
    
    
    [1] http://blog.cloudera.com/blog/2013/07/morphlines-the-easy-way-to-build-and-integrate-etl-apps-for-apache-hadoop/
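
The compile-once idea raised in this comment can be sketched in plain Java. This is only an illustration under stated assumptions: `CompileOnceSketch`, `ResettableCollector`, and `CompiledPipeline` are hypothetical stand-ins, not Kite SDK or NiFi classes, and the upper-casing transform is a placeholder for whatever the morphline would do. The point it tries to show is that a collector passed in at compile time does not force a recompile per record, provided the collector can be emptied between records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Stand-in types only: none of these are Kite SDK or NiFi classes.
class CompileOnceSketch {

    /** Hypothetical final stage that gathers output and can be emptied between inputs. */
    static final class ResettableCollector {
        private final List<String> records = new ArrayList<>();

        void accept(String record) { records.add(record); }
        void reset() { records.clear(); }
        List<String> snapshot() { return new ArrayList<>(records); }
    }

    /** Hypothetical compiled pipeline; the constructor stands in for the costly compile step. */
    static final class CompiledPipeline {
        static int compileCount = 0; // counts how often "compilation" happened

        private final Function<String, String> transform;
        private final ResettableCollector sink;

        CompiledPipeline(Function<String, String> transform, ResettableCollector sink) {
            compileCount++;
            this.transform = transform;
            this.sink = sink;
        }

        void process(String input) { sink.accept(transform.apply(input)); }
    }

    /** Compiles once, then reuses the same pipeline and collector for every input. */
    static List<List<String>> run(List<String> inputs) {
        ResettableCollector collector = new ResettableCollector();
        // Done once, as an @OnScheduled-style setup step would do it.
        CompiledPipeline pipeline = new CompiledPipeline(String::toUpperCase, collector);

        List<List<String>> outputs = new ArrayList<>();
        for (String input : inputs) {
            collector.reset();                 // discard results of the previous input
            pipeline.process(input);           // per-record work only
            outputs.add(collector.snapshot()); // copy out before the next reset
        }
        return outputs;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b")));
        System.out.println("compiles: " + CompiledPipeline.compileCount);
    }
}
```

Two caveats apply if this shape were carried over to the processor under review. NiFi can invoke onTrigger from several threads, so a shared collector would need to be synchronized or confined to one thread. And because the diff evaluates the Morphlines File property against each flow file's attributes, compiling once would presumably only work if that property did not vary per flow file.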



[GitHub] nifi issue #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on the issue:

    https://github.com/apache/nifi/pull/1576
  
    Hey @WilliamNouet, sorry for not giving an update. Been too busy with other obligations. I'll try to get back to this when I have more free time.



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110552988
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-nar/pom.xml ---
    @@ -0,0 +1,41 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<!--
    +  Licensed to the Apache Software Foundation (ASF) under one or more
    +  contributor license agreements. See the NOTICE file distributed with
    +  this work for additional information regarding copyright ownership.
    +  The ASF licenses this file to You under the Apache License, Version 2.0
    +  (the "License"); you may not use this file except in compliance with
    +  the License. You may obtain a copy of the License at
    +  http://www.apache.org/licenses/LICENSE-2.0
    +  Unless required by applicable law or agreed to in writing, software
    +  distributed under the License is distributed on an "AS IS" BASIS,
    +  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +  See the License for the specific language governing permissions and
    +  limitations under the License.
    +-->
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +
    +    <parent>
    +        <groupId>org.apache.com</groupId>
    +        <artifactId>nifi-morphlines-bundle</artifactId>
    +        <version>1.2.0-SNAPSHOT</version>
    +    </parent>
    +
    +    <artifactId>nifi-morphlines-nar</artifactId>
    +    <version>1.2.0-SNAPSHOT</version>
    +    <packaging>nar</packaging>
    +    <properties>
    +        <maven.javadoc.skip>true</maven.javadoc.skip>
    +        <source.skip>true</source.skip>
    +    </properties>
    +
    +    <dependencies>
    +        <dependency>
    +            <groupId>org.apache.nifi</groupId>
    +            <artifactId>nifi-morphlines-processors</artifactId>
    --- End diff --
    
    You need to add a license/notice file for this nar and make any needed additions to the nifi-assembly license/notice files. Every dependency you bring in (including its transitive deps) needs to be accounted for.
    
    You can see everything you're bringing in by running "mvn dependency:tree -Dverbose".
    
    Our licensing guide is here: https://nifi.apache.org/licensing-guide.html


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi issue #1576: NIFI-3518 Create a Morphlines processor

Posted by WilliamNouet <gi...@git.apache.org>.
Github user WilliamNouet commented on the issue:

    https://github.com/apache/nifi/pull/1576
  
    This follows up on the work done in https://github.com/apache/nifi/pull/1529. I opened a new PR to keep the work clean.



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110552745
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-nar/pom.xml ---
    @@ -0,0 +1,41 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<!--
    +  Licensed to the Apache Software Foundation (ASF) under one or more
    +  contributor license agreements. See the NOTICE file distributed with
    +  this work for additional information regarding copyright ownership.
    +  The ASF licenses this file to You under the Apache License, Version 2.0
    +  (the "License"); you may not use this file except in compliance with
    +  the License. You may obtain a copy of the License at
    +  http://www.apache.org/licenses/LICENSE-2.0
    +  Unless required by applicable law or agreed to in writing, software
    +  distributed under the License is distributed on an "AS IS" BASIS,
    +  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +  See the License for the specific language governing permissions and
    +  limitations under the License.
    +-->
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +
    +    <parent>
    +        <groupId>org.apache.com</groupId>
    --- End diff --
    
    This is wrong; it should be "org.apache.nifi".
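
For illustration, here is a sketch of the parent block with the group ID the reviewer asks for. Only the groupId value changes; the artifactId and version are copied from the diff above and are assumed to match the bundle's parent pom.

```xml
<!-- Sketch only: parent block of nifi-morphlines-nar/pom.xml with the corrected groupId. -->
<parent>
    <!-- Was "org.apache.com" in the submitted diff -->
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-morphlines-bundle</artifactId>
    <version>1.2.0-SNAPSHOT</version>
</parent>
```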



[GitHub] nifi issue #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on the issue:

    https://github.com/apache/nifi/pull/1576
  
    Sorry for the delay, @WilliamNouet. I am going to look at it now.



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110552955
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-nar/pom.xml ---
    @@ -0,0 +1,41 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<!--
    +  Licensed to the Apache Software Foundation (ASF) under one or more
    +  contributor license agreements. See the NOTICE file distributed with
    +  this work for additional information regarding copyright ownership.
    +  The ASF licenses this file to You under the Apache License, Version 2.0
    +  (the "License"); you may not use this file except in compliance with
    +  the License. You may obtain a copy of the License at
    +  http://www.apache.org/licenses/LICENSE-2.0
    +  Unless required by applicable law or agreed to in writing, software
    +  distributed under the License is distributed on an "AS IS" BASIS,
    +  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +  See the License for the specific language governing permissions and
    +  limitations under the License.
    +-->
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +
    +    <parent>
    +        <groupId>org.apache.com</groupId>
    +        <artifactId>nifi-morphlines-bundle</artifactId>
    +        <version>1.2.0-SNAPSHOT</version>
    +    </parent>
    +
    +    <artifactId>nifi-morphlines-nar</artifactId>
    --- End diff --
    
    For the nifi-assembly pom to include this nar in the final package, you need to add it as a dependency in nifi-assembly/pom.xml.
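
A minimal sketch of the entry that might be added to the dependencies section of nifi-assembly/pom.xml. The artifactId and version come from the nar pom in the diff above, and the groupId is the one the nar would inherit once its parent is corrected to "org.apache.nifi". The type element and the explicit version are assumptions; the version may instead be managed centrally in the root pom, in which case it would be omitted here.

```xml
<!-- Sketch only: hypothetical addition to the <dependencies> section of nifi-assembly/pom.xml. -->
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-morphlines-nar</artifactId>
    <!-- Assumption: may be unnecessary if the version is managed in the root pom -->
    <version>1.2.0-SNAPSHOT</version>
    <!-- Assumption: nar packaging is declared so the assembly picks up the .nar artifact -->
    <type>nar</type>
</dependency>
```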



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110553041
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/src/main/java/org/apache/nifi/processors/morphlines/ImplementMorphlines.java ---
    @@ -0,0 +1,219 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.nifi.processors.morphlines;
    +
    +import com.google.common.base.Preconditions;
    +import org.apache.nifi.components.PropertyDescriptor;
    +import org.apache.nifi.components.PropertyValue;
    +import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.*;
    +import org.apache.nifi.annotation.documentation.CapabilityDescription;
    +import org.apache.nifi.annotation.documentation.Tags;
    +import org.apache.nifi.annotation.lifecycle.OnScheduled;
    +import org.apache.nifi.processor.exception.ProcessException;
    +import org.apache.nifi.processor.io.InputStreamCallback;
    +import org.apache.nifi.processor.io.OutputStreamCallback;
    +import org.apache.nifi.processor.io.StreamCallback;
    +import org.apache.nifi.processor.util.StandardValidators;
    +import org.apache.nifi.stream.io.StreamUtils;
    +import org.kitesdk.morphline.api.Command;
    +import org.kitesdk.morphline.api.MorphlineContext;
    +import org.kitesdk.morphline.api.Record;
    +import org.kitesdk.morphline.base.Fields;
    +
    +import org.kitesdk.morphline.base.Notifications;
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.ImmutableSet;
    +import org.apache.nifi.annotation.lifecycle.OnScheduled;
    +
    +import java.io.File;
    +import java.io.IOException;
    +import java.io.InputStream;
    +import java.io.OutputStream;
    +import java.util.*;
    +import java.util.stream.*;
    +import java.util.concurrent.atomic.*;
    +import org.apache.nifi.annotation.lifecycle.OnStopped;
    +import org.apache.nifi.processor.exception.*;
    --- End diff --
    
    Please run "mvn clean install -Pcontrib-check" in order to see all your checkstyle issues.



[GitHub] nifi issue #1576: NIFI-3518 Create a Morphlines processor

Posted by WilliamNouet <gi...@git.apache.org>.
Github user WilliamNouet commented on the issue:

    https://github.com/apache/nifi/pull/1576
  
    Closing this one and opening PR #2028 



[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

Posted by JPercivall <gi...@git.apache.org>.
Github user JPercivall commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1576#discussion_r110552773
  
    --- Diff: nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/src/main/java/org/apache/nifi/processors/morphlines/ImplementMorphlines.java ---
    @@ -0,0 +1,219 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.nifi.processors.morphlines;
    +
    +import com.google.common.base.Preconditions;
    +import org.apache.nifi.components.PropertyDescriptor;
    +import org.apache.nifi.components.PropertyValue;
    +import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.*;
    +import org.apache.nifi.annotation.documentation.CapabilityDescription;
    +import org.apache.nifi.annotation.documentation.Tags;
    +import org.apache.nifi.annotation.lifecycle.OnScheduled;
    +import org.apache.nifi.processor.exception.ProcessException;
    +import org.apache.nifi.processor.io.InputStreamCallback;
    +import org.apache.nifi.processor.io.OutputStreamCallback;
    +import org.apache.nifi.processor.io.StreamCallback;
    +import org.apache.nifi.processor.util.StandardValidators;
    +import org.apache.nifi.stream.io.StreamUtils;
    +import org.kitesdk.morphline.api.Command;
    +import org.kitesdk.morphline.api.MorphlineContext;
    +import org.kitesdk.morphline.api.Record;
    +import org.kitesdk.morphline.base.Fields;
    +
    +import org.kitesdk.morphline.base.Notifications;
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.ImmutableSet;
    +import org.apache.nifi.annotation.lifecycle.OnScheduled;
    +
    +import java.io.File;
    +import java.io.IOException;
    +import java.io.InputStream;
    +import java.io.OutputStream;
    +import java.util.*;
    +import java.util.stream.*;
    +import java.util.concurrent.atomic.*;
    +import org.apache.nifi.annotation.lifecycle.OnStopped;
    +import org.apache.nifi.processor.exception.*;
    +
    +@Tags({"kitesdk", "morphlines", "ETL", "HDFS", "avro", "Solr", "HBase"})
    +@CapabilityDescription("Implements Morphlines (http://kitesdk.org/docs/1.1.0/morphlines/) framework, which performs in-memory container of transformation commands in oder to perform tasks such as loading, parsing, transforming, or otherwise processing a single record.")
    +public class MorphlinesProcessor extends AbstractProcessor {
    +
    +    private Command morphline;
    +    private volatile Record record = new Record();
    +    private volatile Collector collector = new Collector();
    +
    +    public static final PropertyDescriptor MORPHLINES_ID = new PropertyDescriptor
    +            .Builder().name("Morphlines ID")
    +            .description("Identifier of the morphlines context")
    +            .required(true)
    +            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
    +            .expressionLanguageSupported(true)
    +            .build();
    +
    +    public static final PropertyDescriptor MORPHLINES_FILE = new PropertyDescriptor
    +            .Builder().name("Morphlines File")
    +            .description("File for the morphlines context")
    +            .required(true)
    +            .addValidator(StandardValidators.FILE_EXISTS_VALIDATOR)
    +            .expressionLanguageSupported(true)
    +            .build();
    +
    +    public static final PropertyDescriptor MORPHLINES_OUTPUT_FIELD = new PropertyDescriptor
    +            .Builder().name("Morphlines output field")
    +            .description("Field name of output in Morphlines. Default is '_attachment_body'.")
    +            .required(false)
    +	    .expressionLanguageSupported(true)
    +            .defaultValue("_attachment_body")
    +            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
    +            .build();
    +
    +    public static final Relationship REL_SUCCESS = new Relationship.Builder()
    +            .name("success")
    +            .description("Relationship for success.")
    +            .build();
    +
    +    public static final Relationship REL_FAILURE = new Relationship.Builder()
    +            .name("failure")
    +            .description("Relationship for failure of morphlines.")
    +            .build();
    +
    +    private static final List<PropertyDescriptor> PROPERTIES = ImmutableList.<PropertyDescriptor>builder()
    +            .add(MORPHLINES_FILE)
    +            .add(MORPHLINES_ID)
    +            .add(MORPHLINES_OUTPUT_FIELD)
    +            .build();
    +
    +    private static final Set<Relationship> RELATIONSHIPS = ImmutableSet.<Relationship>builder()
    +            .add(REL_SUCCESS)
    +            .add(REL_FAILURE)
    +            .build();
    +
    +
    +    private File morphLinesFile;
    +    private String morphLinesId;
    +    private String morphlinesOutputField;
    +    private PropertyValue morphLinesFileProperty;
    +    private PropertyValue morphLinesIdProperty;
    +    private PropertyValue morphLinesOutputField;
    +    private MorphlineContext morphlineContext;
    --- End diff --
    
    You didn't address my comment[1] on these variables. Please do so.
    
    [1] https://github.com/apache/nifi/pull/1529#discussion_r102964711

