Posted to dev@metron.apache.org by Ryan Merriman <me...@gmail.com> on 2019/03/08 14:47:38 UTC

[DISCUSS] Upgrading HBase and Kafka support

I have been researching the effort involved in upgrading to HDP 3.  Along the
way I've found a couple of challenging issues that we will need to solve, both
involving our integration testing strategy.

The first issue is Kafka.  We are moving from 0.10.0 to 2.0.0 and there
have been significant changes to the API.  This creates an issue in the
KafkaComponent class, which we use as an in-memory Kafka server in
integration tests.  Most of the classes that were previously used have gone
away, and to the best of my knowledge, were not supported as public APIs.
I also don't see any publicly documented APIs to replace them.

The second issue is HBase.  We are moving from 1.1.2 to 2.0.2, so another
significant change.  This creates an issue in the MockHTable class
because the HTableInterface interface has been replaced by Table, essentially
requiring that MockHTable be rewritten to conform to the new interface.
It's my opinion that this class is complicated and difficult to maintain as
it is anyway.

These 2 issues have the potential to add a significant amount of work to
upgrading Metron to HDP 3.  I want to take a step back and review our
options before we move forward.  Here are some initial thoughts I had on
how to approach this.  For HBase:

   1. Update MockHTable to work with the new HBase API.  We would continue
   using a mock server approach for HBase.
   2. Research replacing MockHTable with an in-memory HBase server.
   3. Replace MockHTable with a Docker container running HBase.

For Kafka:

   1. Replace KafkaComponent with a mock server implementation.
   2. Update KafkaComponent to work with the new API.  We would probably
   need to leverage some internal Kafka classes.  I do not see a testing API
   documented publicly.
   3. Replace KafkaComponent with a Docker container running Kafka.

What other options are there?  Whatever we choose I think we should follow
a similar approach for both (mock servers, in memory servers, Docker, other
options I'm not thinking of).

This will not shock anyone but I would be in favor of Docker containers.
They have the advantage of classpath isolation, easy upgrades, and accurate
integration testing.  The downside is we will have to adjust our tests and
Travis script to incorporate these Docker containers into our build
process.  We have discussed this at length in the past and it has generally
stalled for various reasons.  Maybe if we move a few services at a time it
might be more palatable?  As for the other 2 approaches, I think if either
worked well we wouldn't be having this discussion.  Mock servers are hard
to maintain and I don't see in memory testing classes documented in
javadocs for either service.
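
To make the Docker option a little more concrete, here is a minimal sketch,
assuming a test helper that shells out to the docker CLI.  DockerComponent,
the image name, and the container name are hypothetical and are not existing
Metron classes or published images; only the docker run / docker stop flags
are real.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical test helper (not an existing Metron class): wraps the docker CLI
 * so a test can start a service container before it runs and stop it afterwards.
 */
class DockerComponent {
  private final String image;       // image to run; any name used here is an assumption
  private final String name;        // container name, used later to stop it
  private final int hostPort;       // port exposed on the test machine
  private final int containerPort;  // port the service listens on inside the container

  DockerComponent(String image, String name, int hostPort, int containerPort) {
    this.image = image;
    this.name = name;
    this.hostPort = hostPort;
    this.containerPort = containerPort;
  }

  /** Builds: docker run -d --rm --name NAME -p HOST:CONTAINER IMAGE */
  List<String> startCommand() {
    return Arrays.asList("docker", "run", "-d", "--rm",
        "--name", name,
        "-p", hostPort + ":" + containerPort,
        image);
  }

  /** Builds: docker stop NAME (with --rm above, stopping also removes the container). */
  List<String> stopCommand() {
    return Arrays.asList("docker", "stop", name);
  }

  void start() throws IOException, InterruptedException {
    run(startCommand());
  }

  void stop() throws IOException, InterruptedException {
    run(stopCommand());
  }

  /** Runs a command, forwarding its output, and fails on a non-zero exit code. */
  private static void run(List<String> command) throws IOException, InterruptedException {
    Process process = new ProcessBuilder(command).inheritIO().start();
    int exitCode = process.waitFor();
    if (exitCode != 0) {
      throw new IllegalStateException(
          "Command exited with " + exitCode + ": " + String.join(" ", command));
    }
  }
}
```

A test class could call start() from a @BeforeClass method and stop() from
an @AfterClass method, which is roughly the lifecycle the in-memory
components have today.  Note that start() only returns once the container has
been launched; a real implementation would still need some readiness check
before the tests connect to the service.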

Thoughts?

Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by Michael Miklavcic <mi...@gmail.com>.
Jon, I didn't get a chance to review your PR, but it immediately came to
mind wrt this task - did you guys have that working in Travis with Maven?
What does the container lifecycle you came up with look like?

On Fri, Mar 8, 2019 at 8:30 AM Zeolla@GMail.com <ze...@gmail.com> wrote:

> +1 to option 3 on both.  Also strongly in favor of Docker.  We recently
> took a similar approach in metron-bro-plugin-kafka as well (link
> <https://github.com/apache/metron-bro-plugin-kafka/tree/master/docker>) to
> do end to end testing.
>
> Jon
>

Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by "Zeolla@GMail.com" <ze...@gmail.com>.
+1 to option 3 on both.  Also strongly in favor of Docker.  We recently
took a similar approach in metron-bro-plugin-kafka as well (link
<https://github.com/apache/metron-bro-plugin-kafka/tree/master/docker>) to
do end to end testing.

Jon

-- 

Jon Zeolla

Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by Nick Allen <ni...@nickallen.org>.
+1 for option 3.  I am in favor of using Docker for the integration tests
for all the reasons that you mentioned.

Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by Simon Elliston Ball <si...@simonellistonball.com>.
The Docker option sounds like a much better and cleaner option for integration testing (closer to real too). My one question would be whether this would significantly increase test run time, and whether that would need Travis changes? 

Either way, the docker option sounds best.

Simon

> On 8 Mar 2019, at 16:38, Michael Miklavcic <mi...@gmail.com> wrote:
> 
> I'm -1 on #1 unless there's some desperately compelling reason to go that
> route. It would be a regression in our test coverage, and at that point
> it's really just duplicating our unit tests as opposed to checking our
> integration.
> 
> I'm good with 3. Gating factors for a successful implementation would be
> that as a developer I can:
> 
>   1. Run it in my IDE without having to do anything extra (the beauty of
>   the in-mem component is that @BeforeClass spins it up automatically - we
>   should keep doing something along those lines)
>   2. Run it via Maven cli
>   3. Run it in Travis as part of our normal build
> 
> It's probably worth looking at Kafka's testing infrastructure straight from
> the source - https://github.com/apache/kafka/blob/trunk/tests/README.md.
> They leverage Docker containers now for system tests.
> 
> Best,
> Mike
> 
> 

Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by Kyle Richardson <ky...@gmail.com>.
+1 to Docker and looking at testcontainers.org

-- 
-Kyle

Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by Otto Fowler <ot...@gmail.com>.
https://github.com/apache/metron-bro-plugin-kafka/tree/master/docker


Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by "Zeolla@GMail.com" <ze...@gmail.com>.
So most importantly I want to make sure to give Otto credit for being the
one who cleaned up the rudimentary testing steps we had for testing the
plugin and turned them into the Docker end-to-end tests.  Right now we run
the tests manually, as there were a few follow-ons we needed to work through
before it's ready for Travis.  In my opinion, once METRON-2003 (PR 26) gets
in it'll be ready for Travis.  There isn't any Maven use currently.

Jon

-- 

Jon Zeolla

Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by Otto Fowler <ot...@gmail.com>.
I believe that Testcontainers allows the IDE case.


On March 8, 2019 at 11:38:24, Michael Miklavcic (michael.miklavcic@gmail.com)
wrote:

I'm -1 on #1 unless there's some desperately compelling reason to go that
route. It would be a regression in our test coverage, and at that point
it's really just duplicating our unit tests as opposed to checking our
integration.

I'm good with 3. Gating factors for a successful implementation would be
that as a developer I can:

1. Run it in my IDE without having to do anything extra (the beauty of
the in-mem component is that @BeforeClass spins it up automatically - we
should keep doing something along those lines)
2. Run it via Maven cli
3. Run it in Travis as part of our normal build

It's probably worth looking at Kafka's testing infrastructure straight from
the source - https://github.com/apache/kafka/blob/trunk/tests/README.md.
They leverage Docker containers now for system tests.

Best,
Mike



Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by Michael Miklavcic <mi...@gmail.com>.
I'm -1 on #1 unless there's some desperately compelling reason to go that
route. It would be a regression in our test coverage, and at that point
it's really just duplicating our unit tests as opposed to checking our
integration.

I'm good with 3. Gating factors for a successful implementation would be
that as a developer I can:

   1. Run it in my IDE without having to do anything extra (the beauty of
   the in-mem component is that @BeforeClass spins it up automatically - we
   should keep doing something along those lines)
   2. Run it via Maven cli
   3. Run it in Travis as part of our normal build
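
To make the first gating factor concrete, here is a minimal sketch in plain
Java of a class-level fixture: the service is started once before a group of
tests and is always stopped afterwards, which is roughly what a
@BeforeClass/@AfterClass pair does for an in-memory component. The names
ServiceUnderTest, RecordingService and ClassLevelFixture are hypothetical and
are not Metron, Kafka, HBase or Testcontainers APIs. RecordingService only
records lifecycle calls and returns a placeholder address; a real
implementation would start an in-memory server or a Docker container instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical lifecycle contract (an assumption, not an existing Metron interface). */
interface ServiceUnderTest extends AutoCloseable {
  void start();

  /** Address tests should connect to; only valid while the service is running. */
  String endpoint();

  @Override
  void close();
}

/** Stand-in service: records lifecycle calls instead of starting Kafka or HBase. */
final class RecordingService implements ServiceUnderTest {
  final List<String> events = new ArrayList<>();
  private boolean running;

  @Override
  public void start() {
    running = true;
    events.add("start");
  }

  @Override
  public String endpoint() {
    if (!running) {
      throw new IllegalStateException("service is not running");
    }
    return "localhost:9092"; // placeholder address, nothing listens here
  }

  @Override
  public void close() {
    running = false;
    events.add("stop");
  }
}

/** Starts the service once, runs every test against it, and always stops it. */
final class ClassLevelFixture {
  private ClassLevelFixture() {}

  static void runWith(ServiceUnderTest service, List<Consumer<ServiceUnderTest>> tests) {
    service.start();
    try {
      for (Consumer<ServiceUnderTest> test : tests) {
        test.accept(service);
      }
    } finally {
      service.close(); // runs even if a test throws
    }
  }
}
```

Because the tests only see ServiceUnderTest, the same test code could in
principle run from the IDE, the Maven CLI, or Travis, provided the
implementation behind it can start in each of those environments.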

It's probably worth looking at Kafka's testing infrastructure straight from
the source - https://github.com/apache/kafka/blob/trunk/tests/README.md.
They leverage Docker containers now for system tests.

Best,
Mike



Re: [DISCUSS] Upgrading HBase and Kafka support

Posted by Otto Fowler <ot...@gmail.com>.
I think I have mentioned it before, but Testcontainers
(https://www.testcontainers.org) could be a viable way to implement the
Docker approach (option 3). I think it is worth looking at.


