Posted to dev@nifi.apache.org by Andy Christianson <ac...@hortonworks.com> on 2017/07/12 16:26:45 UTC

MINIFI-350 minifi-cpp end-to-end integration testing framework

Hi All,

I am looking at MINIFI-350 and would like to implement some end-to-end integration tests for minifi cpp. Essentially, the tests would:


  1.  Stand up a new minifi cpp docker container
  2.  Send test data to HTTP input ports on the container
  3.  Run data through a minifi flow
  4.  Receive output data at an HTTP endpoint
  5.  Verify output data according to some constraints (headers present, hash of the content, etc.)
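
A minimal sketch of what step 5 could look like, written in Python purely for illustration. The function name, its parameters, and the choice of SHA-256 are assumptions and not something settled in this thread:

```python
import hashlib


def verify_output(headers, body, required_headers, expected_sha256):
    """Return a list of constraint violations; an empty list means the output passed.

    All names here are hypothetical; `headers` is a mapping of received
    HTTP headers and `body` is the received content as bytes.
    """
    problems = []
    # HTTP header names are case-insensitive, so compare them in lower case.
    present = {name.lower() for name in headers}
    for name in required_headers:
        if name.lower() not in present:
            problems.append(f"missing header: {name}")
    # Compare the hash of the received content with the expected hash.
    actual = hashlib.sha256(body).hexdigest()
    if actual != expected_sha256:
        problems.append(f"content hash mismatch: {actual}")
    return problems
```

An empty result would mean the received output satisfied every constraint; anything else lists what went wrong.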

Most of this work, such as setting up a docker container and sending data to it, can naturally be done with shell commands. As such, I’ve taken a look at the bats [1] testing framework, which seems simple enough and is very expressive.

Any thoughts or suggestions on test frameworks to use are appreciated.

[1]: https://github.com/sstephenson/bats

Re: MINIFI-350 minifi-cpp end-to-end integration testing framework

Posted by Andy Christianson <ac...@hortonworks.com>.

MiNiFi cpp team,

I have created the initial pytest/docker-based test framework as well as a few initial test cases. Please review & merge the PR (https://github.com/apache/nifi-minifi-cpp/pull/126) at your convenience.

Regards,

Andy I.C.
________________________________________
From: kangaxx84@gmail.com <ka...@gmail.com> on behalf of Haimo Liu <ha...@gmail.com>
Sent: Thursday, July 13, 2017 2:07 PM
To: dev@nifi.apache.org
Subject: Re: MINIFI-350 minifi-cpp end-to-end integration testing framework

Great idea, Andy! I can see this being extremely valuable even outside of
the MiNiFi cpp context. Specifically, when migrating my dataflow from one
environment to another (DEV to QA to PROD), an integration testing
framework could be very helpful for flow validation purposes.

In addition to testing your MiNiFi agents and network connectivity, have
you taken into consideration the integration testing of a potentially very
complex dataflow itself? Say I am collecting data from 50 data sources and
ingesting into 20 different targets; could I leverage your testing framework
to spin up the necessary containers (HDFS, HBase, Oracle, etc., just
different endpoints) and run a docker compose script to validate my flow
during migration? It would be very nice to see your framework designed to be
extensible so that it covers flow-specific testing as well. Maybe you
already have it all sorted out :)

Thanks,
Haimo

On Thu, Jul 13, 2017 at 1:50 PM, Andy Christianson <
achristianson@hortonworks.com> wrote:

> Thanks for the feedback. I will put together a proof of concept which we
> can further evaluate/refine/merge upstream.
>
> -Andy
>
> On 7/13/17, 11:30 AM, "Kevin Doran" <kd...@gmail.com> wrote:
>
>     Great idea, Andy! Additional types of automated tests would help the
> minifi-cpp project significantly, and I think your proposal is an
> appropriate way to add integration tests for the minifi agent. This sounds
> like a great way to verify expected behavior of processors and the system
> of components in flow combinations.
>
>     I like the idea of declarative tests that are interpreted / run by a
> harness or framework as a way to allow others to contribute test cases.
>
>     I've never used the Bats framework before, but it seems like a
> reasonable option for what you describe. It might require writing a fair
> amount of bash code under-the-hood to get the functionality you want
> (helper functions and such), but it looks like it would keep the test cases
> themselves and the output clean and light. Perhaps others can offer
> suggestions here.
>
>     One comment, which you've probably already considered, is that we
> should keep the dependencies (if any) that get added for integration tests
> that leverage the docker target optional so they are not required for folks
> that just want to build libminifi or the agent. It would be more of a
> developer/contributor option but users could skip these tests.
>
>     /docker/test/integration seems like a reasonable place to add test
> cases. Others would probably know better. I think the README.md would be a
> reasonable place to document how to run the tests with a reference to
> another document that describes how to add / contribute new test cases. I'm
> not sure what the best location for the documentation would be.
>
>     Thanks,
>     Kevin
>
>     On 7/13/17, 10:34, "Andy Christianson" <ac...@hortonworks.com>
> wrote:
>
>         Yes, I envision having a directory of declarative test cases. Each
> would include a flow yaml, one or more input files, and expected outputs.
>
>         I’d like to document the convention before writing the
> implementation because if the conventions are solid, we can change out the
> actual test driver implementation later on if needed.
>
>         Would it be best to document this in a section within /README.md,
> or should I add a new file such as /docs/Testing.md, or /TESTING.md?
>
>         As for where the test cases would be added, I was thinking maybe
> /docker/test/integration, keeping consistent with the existing convention
> (i.e. /libminifi/test/integration).
>
>         -Andy
>
>         On 7/13/17, 10:14 AM, "Marc" <ph...@apache.org> wrote:
>
>             Hi Andy,
>                I think this is a great idea to test integrating MiNiFi
> among multiple
>             system components. Do you have a feel for how you will allow
> others to
>             create test cases? Will you attempt to minimize the footprint
> of
>             contributed tests by creating a bats based framework? I ask
> because it
>             would be cool if contributors could supply a flow ( input )
> and expected
>             output and we automatically run the necessary
> containers/components. Is
>             this along the lines of your vision?
>
>               Thanks,
>                Marc
>
>             On Wed, Jul 12, 2017 at 12:26 PM, Andy Christianson <
>             achristianson@hortonworks.com> wrote:
>
>             > Hi All,
>             >
>             > I am looking at MINIFI-350 and would like to implement some
> end-to-end
>             > integration tests for minifi cpp. Essentially, the tests
> would:
>             >
>             >
>             >   1.  Stand up a new minifi cpp docker container
>             >   2.  Send test data to HTTP input ports on the container
>             >   3.  Run data through a minifi flow
>             >   4.  Receive output data to a HTTP endpoint
>             >   5.  Verify output data according to some constraints
> (headers present,
>             > hash of the content, etc.)
>             >
>             > Most of this work, such as setting up a docker container and
> sending data
>             > to it, can naturally be done with shell commands. As such,
> I’ve taken a
>             > look at the bats [1] testing framework, which seems simple
> enough and is
>             > very expressive.
>             >
>             > Any thoughts or suggestions on test frameworks to use are
> appreciated.
>             >
>             > [1]: https://github.com/sstephenson/bats
>             >
>             >
>

Re: MINIFI-350 minifi-cpp end-to-end integration testing framework

Posted by Haimo Liu <ha...@gmail.com>.

Great idea, Andy! I can see this being extremely valuable even outside of
the MiNiFi cpp context. Specifically, when migrating my dataflow from one
environment to another (DEV to QA to PROD), an integration testing
framework could be very helpful for flow validation purposes.

In addition to testing your MiNiFi agents and network connectivity, have
you taken into consideration the integration testing of a potentially very
complex dataflow itself? Say I am collecting data from 50 data sources and
ingesting into 20 different targets; could I leverage your testing framework
to spin up the necessary containers (HDFS, HBase, Oracle, etc., just
different endpoints) and run a docker compose script to validate my flow
during migration? It would be very nice to see your framework designed to be
extensible so that it covers flow-specific testing as well. Maybe you
already have it all sorted out :)

Thanks,
Haimo

On Thu, Jul 13, 2017 at 1:50 PM, Andy Christianson <
achristianson@hortonworks.com> wrote:

> Thanks for the feedback. I will put together a proof of concept which we
> can further evaluate/refine/merge upstream.
>
> -Andy
>
> On 7/13/17, 11:30 AM, "Kevin Doran" <kd...@gmail.com> wrote:
>
>     Great idea, Andy! Additional types of automated tests would help the
> minifi-cpp project significantly, and I think your proposal is an
> appropriate way to add integration tests for the minifi agent. This sounds
> like a great way to verify expected behavior of processors and the system
> of components in flow combinations.
>
>     I like the idea of declarative tests that are interpreted / run by a
> harness or framework as a way to allow others to contribute test cases.
>
>     I've never used the Bats framework before, but it seems like a
> reasonable option for what you describe. It might require writing a fair
> amount of bash code under-the-hood to get the functionality you want
> (helper functions and such), but it looks like it would keep the test cases
> themselves and the output clean and light. Perhaps others can offer
> suggestions here.
>
>     One comment, which you've probably already considered, is that we
> should keep the dependencies (if any) that get added for integration tests
> that leverage the docker target optional so they are not required for folks
> that just want to build libminifi or the agent. It would be more of a
> developer/contributor option but users could skip these tests.
>
>     /docker/test/integration seems like a reasonable place to add test
> cases. Others would probably know better. I think the README.md would be a
> reasonable place to document how to run the tests with a reference to
> another document that describes how to add / contribute new test cases. I'm
> not sure what the best location for the documentation would be.
>
>     Thanks,
>     Kevin
>
>     On 7/13/17, 10:34, "Andy Christianson" <ac...@hortonworks.com>
> wrote:
>
>         Yes, I envision having a directory of declarative test cases. Each
> would include a flow yaml, one or more input files, and expected outputs.
>
>         I’d like to document the convention before writing the
> implementation because if the conventions are solid, we can change out the
> actual test driver implementation later on if needed.
>
>         Would it be best to document this in a section within /README.md,
> or should I add a new file such as /docs/Testing.md, or /TESTING.md?
>
>         As for where the test cases would be added, I was thinking maybe
> /docker/test/integration, keeping consistent with the existing convention
> (i.e. /libminifi/test/integration).
>
>         -Andy
>
>         On 7/13/17, 10:14 AM, "Marc" <ph...@apache.org> wrote:
>
>             Hi Andy,
>                I think this is a great idea to test integrating MiNiFi
> among multiple
>             system components. Do you have a feel for how you will allow
> others to
>             create test cases? Will you attempt to minimize the footprint
> of
>             contributed tests by creating a bats based framework? I ask
> because it
>             would be cool if contributors could supply a flow ( input )
> and expected
>             output and we automatically run the necessary
> containers/components. Is
>             this along the lines of your vision?
>
>               Thanks,
>                Marc
>
>             On Wed, Jul 12, 2017 at 12:26 PM, Andy Christianson <
>             achristianson@hortonworks.com> wrote:
>
>             > Hi All,
>             >
>             > I am looking at MINIFI-350 and would like to implement some
> end-to-end
>             > integration tests for minifi cpp. Essentially, the tests
> would:
>             >
>             >
>             >   1.  Stand up a new minifi cpp docker container
>             >   2.  Send test data to HTTP input ports on the container
>             >   3.  Run data through a minifi flow
>             >   4.  Receive output data to a HTTP endpoint
>             >   5.  Verify output data according to some constraints
> (headers present,
>             > hash of the content, etc.)
>             >
>             > Most of this work, such as setting up a docker container and
> sending data
>             > to it, can naturally be done with shell commands. As such,
> I’ve taken a
>             > look at the bats [1] testing framework, which seems simple
> enough and is
>             > very expressive.
>             >
>             > Any thoughts or suggestions on test frameworks to use are
> appreciated.
>             >
>             > [1]: https://github.com/sstephenson/bats
>             >
>             >
>

Re: MINIFI-350 minifi-cpp end-to-end integration testing framework

Posted by Andy Christianson <ac...@hortonworks.com>.

Thanks for the feedback. I will put together a proof of concept which we can further evaluate/refine/merge upstream.

-Andy

On 7/13/17, 11:30 AM, "Kevin Doran" <kd...@gmail.com> wrote:

    Great idea, Andy! Additional types of automated tests would help the minifi-cpp project significantly, and I think your proposal is an appropriate way to add integration tests for the minifi agent. This sounds like a great way to verify expected behavior of processors and the system of components in flow combinations.
    
    I like the idea of declarative tests that are interpreted / run by a harness or framework as a way to allow others to contribute test cases.
    
    I've never used the Bats framework before, but it seems like a reasonable option for what you describe. It might require writing a fair amount of bash code under-the-hood to get the functionality you want (helper functions and such), but it looks like it would keep the test cases themselves and the output clean and light. Perhaps others can offer suggestions here.
    
    One comment, which you've probably already considered, is that we should keep the dependencies (if any) that get added for integration tests that leverage the docker target optional so they are not required for folks that just want to build libminifi or the agent. It would be more of a developer/contributor option but users could skip these tests.
    
    /docker/test/integration seems like a reasonable place to add test cases. Others would probably know better. I think the README.md would be a reasonable place to document how to run the tests with a reference to another document that describes how to add / contribute new test cases. I'm not sure what the best location for the documentation would be.
    
    Thanks,
    Kevin
    
    On 7/13/17, 10:34, "Andy Christianson" <ac...@hortonworks.com> wrote:
    
        Yes, I envision having a directory of declarative test cases. Each would include a flow yaml, one or more input files, and expected outputs.
        
        I’d like to document the convention before writing the implementation because if the conventions are solid, we can change out the actual test driver implementation later on if needed.
        
        Would it be best to document this in a section within /README.md, or should I add a new file such as /docs/Testing.md, or /TESTING.md?
        
        As for where the test cases would be added, I was thinking maybe /docker/test/integration, keeping consistent with the existing convention (i.e. /libminifi/test/integration).
        
        -Andy
        
        On 7/13/17, 10:14 AM, "Marc" <ph...@apache.org> wrote:
        
            Hi Andy,
               I think this is a great idea to test integrating MiNiFi among multiple
            system components. Do you have a feel for how you will allow others to
            create test cases? Will you attempt to minimize the footprint of
            contributed tests by creating a bats based framework? I ask because it
            would be cool if contributors could supply a flow ( input ) and expected
            output and we automatically run the necessary containers/components. Is
            this along the lines of your vision?
            
              Thanks,
               Marc
            
            On Wed, Jul 12, 2017 at 12:26 PM, Andy Christianson <
            achristianson@hortonworks.com> wrote:
            
            > Hi All,
            >
            > I am looking at MINIFI-350 and would like to implement some end-to-end
            > integration tests for minifi cpp. Essentially, the tests would:
            >
            >
            >   1.  Stand up a new minifi cpp docker container
            >   2.  Send test data to HTTP input ports on the container
            >   3.  Run data through a minifi flow
            >   4.  Receive output data to a HTTP endpoint
            >   5.  Verify output data according to some constraints (headers present,
            > hash of the content, etc.)
            >
            > Most of this work, such as setting up a docker container and sending data
            > to it, can naturally be done with shell commands. As such, I’ve taken a
            > look at the bats [1] testing framework, which seems simple enough and is
            > very expressive.
            >
            > Any thoughts or suggestions on test frameworks to use are appreciated.
            >
            > [1]: https://github.com/sstephenson/bats
            >
            >

Re: MINIFI-350 minifi-cpp end-to-end integration testing framework

Posted by Kevin Doran <kd...@gmail.com>.

Great idea, Andy! Additional types of automated tests would help the minifi-cpp project significantly, and I think your proposal is an appropriate way to add integration tests for the minifi agent. This sounds like a great way to verify expected behavior of processors and the system of components in flow combinations.

I like the idea of declarative tests that are interpreted / run by a harness or framework as a way to allow others to contribute test cases.

I've never used the Bats framework before, but it seems like a reasonable option for what you describe. It might require writing a fair amount of bash code under-the-hood to get the functionality you want (helper functions and such), but it looks like it would keep the test cases themselves and the output clean and light. Perhaps others can offer suggestions here.

One comment, which you've probably already considered, is that we should keep the dependencies (if any) that get added for integration tests that leverage the docker target optional so they are not required for folks that just want to build libminifi or the agent. It would be more of a developer/contributor option but users could skip these tests.
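
One way to keep the docker-based tests optional at run time, sketched in Python with only the standard library. This is an illustration under assumptions: the helper name and the test class are hypothetical, and the check only looks for a `docker` executable on PATH, not for a running daemon:

```python
import shutil
import unittest


def docker_available():
    """True when a `docker` executable is on PATH (the daemon is not checked)."""
    return shutil.which("docker") is not None


@unittest.skipUnless(docker_available(), "docker not found; skipping integration tests")
class IntegrationTests(unittest.TestCase):
    def test_placeholder(self):
        # A real test would start a container and push data through a flow.
        self.assertTrue(docker_available())
```

With a guard like this, someone who only wants to build libminifi or the agent would see the integration tests reported as skipped rather than failed.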

/docker/test/integration seems like a reasonable place to add test cases. Others would probably know better. I think the README.md would be a reasonable place to document how to run the tests with a reference to another document that describes how to add / contribute new test cases. I'm not sure what the best location for the documentation would be.

Thanks,
Kevin

On 7/13/17, 10:34, "Andy Christianson" <ac...@hortonworks.com> wrote:

    Yes, I envision having a directory of declarative test cases. Each would include a flow yaml, one or more input files, and expected outputs.

    I’d like to document the convention before writing the implementation because if the conventions are solid, we can change out the actual test driver implementation later on if needed.

    Would it be best to document this in a section within /README.md, or should I add a new file such as /docs/Testing.md, or /TESTING.md?

    As for where the test cases would be added, I was thinking maybe /docker/test/integration, keeping consistent with the existing convention (i.e. /libminifi/test/integration).

    -Andy

    On 7/13/17, 10:14 AM, "Marc" <ph...@apache.org> wrote:

        Hi Andy,
           I think this is a great idea to test integrating MiNiFi among multiple
        system components. Do you have a feel for how you will allow others to
        create test cases? Will you attempt to minimize the footprint of
        contributed tests by creating a bats based framework? I ask because it
        would be cool if contributors could supply a flow ( input ) and expected
        output and we automatically run the necessary containers/components. Is
        this along the lines of your vision?

          Thanks,
           Marc

        On Wed, Jul 12, 2017 at 12:26 PM, Andy Christianson <
        achristianson@hortonworks.com> wrote:

        > Hi All,
        >
        > I am looking at MINIFI-350 and would like to implement some end-to-end
        > integration tests for minifi cpp. Essentially, the tests would:
        >
        >
        >   1.  Stand up a new minifi cpp docker container
        >   2.  Send test data to HTTP input ports on the container
        >   3.  Run data through a minifi flow
        >   4.  Receive output data to a HTTP endpoint
        >   5.  Verify output data according to some constraints (headers present,
        > hash of the content, etc.)
        >
        > Most of this work, such as setting up a docker container and sending data
        > to it, can naturally be done with shell commands. As such, I’ve taken a
        > look at the bats [1] testing framework, which seems simple enough and is
        > very expressive.
        >
        > Any thoughts or suggestions on test frameworks to use are appreciated.
        >
        > [1]: https://github.com/sstephenson/bats
        >
        >

Re: MINIFI-350 minifi-cpp end-to-end integration testing framework

Posted by Andy Christianson <ac...@hortonworks.com>.

Yes, I envision having a directory of declarative test cases. Each would include a flow yaml, one or more input files, and expected outputs.

I’d like to document the convention before writing the implementation because if the conventions are solid, we can change out the actual test driver implementation later on if needed.

Would it be best to document this in a section within /README.md, or should I add a new file such as /docs/Testing.md, or /TESTING.md?

As for where the test cases would be added, I was thinking maybe /docker/test/integration, keeping consistent with the existing convention (i.e. /libminifi/test/integration).

-Andy

On 7/13/17, 10:14 AM, "Marc" <ph...@apache.org> wrote:

    Hi Andy,
       I think this is a great idea to test integrating MiNiFi among multiple
    system components. Do you have a feel for how you will allow others to
    create test cases? Will you attempt to minimize the footprint of
    contributed tests by creating a bats based framework? I ask because it
    would be cool if contributors could supply a flow ( input ) and expected
    output and we automatically run the necessary containers/components. Is
    this along the lines of your vision?

      Thanks,
       Marc

    On Wed, Jul 12, 2017 at 12:26 PM, Andy Christianson <
    achristianson@hortonworks.com> wrote:

    > Hi All,
    >
    > I am looking at MINIFI-350 and would like to implement some end-to-end
    > integration tests for minifi cpp. Essentially, the tests would:
    >
    >
    >   1.  Stand up a new minifi cpp docker container
    >   2.  Send test data to HTTP input ports on the container
    >   3.  Run data through a minifi flow
    >   4.  Receive output data to a HTTP endpoint
    >   5.  Verify output data according to some constraints (headers present,
    > hash of the content, etc.)
    >
    > Most of this work, such as setting up a docker container and sending data
    > to it, can naturally be done with shell commands. As such, I’ve taken a
    > look at the bats [1] testing framework, which seems simple enough and is
    > very expressive.
    >
    > Any thoughts or suggestions on test frameworks to use are appreciated.
    >
    > [1]: https://github.com/sstephenson/bats
    >
    >

Re: MINIFI-350 minifi-cpp end-to-end integration testing framework

Posted by Marc <ph...@apache.org>.

Hi Andy,
   I think this is a great idea to test integrating MiNiFi among multiple
system components. Do you have a feel for how you will allow others to
create test cases? Will you attempt to minimize the footprint of
contributed tests by creating a bats-based framework? I ask because it
would be cool if contributors could supply a flow (input) and expected
output and we automatically run the necessary containers/components. Is
this along the lines of your vision?

  Thanks,
   Marc

On Wed, Jul 12, 2017 at 12:26 PM, Andy Christianson <
achristianson@hortonworks.com> wrote:

> Hi All,
>
> I am looking at MINIFI-350 and would like to implement some end-to-end
> integration tests for minifi cpp. Essentially, the tests would:
>
>
>   1.  Stand up a new minifi cpp docker container
>   2.  Send test data to HTTP input ports on the container
>   3.  Run data through a minifi flow
>   4.  Receive output data to a HTTP endpoint
>   5.  Verify output data according to some constraints (headers present,
> hash of the content, etc.)
>
> Most of this work, such as setting up a docker container and sending data
> to it, can naturally be done with shell commands. As such, I’ve taken a
> look at the bats [1] testing framework, which seems simple enough and is
> very expressive.
>
> Any thoughts or suggestions on test frameworks to use are appreciated.
>
> [1]: https://github.com/sstephenson/bats
>
>