You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by pnowojski <gi...@git.apache.org> on 2017/07/12 12:28:43 UTC

[GitHub] flink pull request #4310: [misc] Commit read offsets in Kafka integration te...

GitHub user pnowojski opened a pull request:

    https://github.com/apache/flink/pull/4310

    [misc] Commit read offsets in Kafka integration tests

    Previously offsets were not committed so the same records could be read more then once.
    It was not a big issue, because so far this methods were used only for at-least-once tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pnowojski/flink offsets

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4310.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4310
    
----
commit 615eba8a10fc10ac3a5c23bedc2acc197d8cca22
Author: Piotr Nowojski <pi...@gmail.com>
Date:   2017-07-12T12:01:49Z

    [misc] Commit read offsets in Kafka integration tests
    
    Previously offsets were not commited so the same records could be read more then once.
    It was not a big issue, because so far this methods were used only for at-least-once tests.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    Maybe some method Javadoc explaining that would be nice.
    From the method name `getAllRecordsFromTopic`, the behaviour isn't that obvious.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #4310: [misc] Commit read offsets in Kafka integration te...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/4310


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    yes, Travis passes @tzulitai :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    Ok :) LGTM, merging ..


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    For consumer side or mapper side it is natural to use that kind of validating mappers, because you could just add them at the end of your pipeline. 
    
    For producers tests it isn't, because you need to spawn additional Flink job for this purpose, which seems unnatural to me. It would add a test dependency to a consumer code (bug in consumer would/could brake producer tests making the error messages very confusing). Furthermore using second Flink job would be definitely more heavy and more time/resources consuming - this second job would need to execute exactly same code as those methods, but wrapped into additional layer (Flink application). Lastly this wrapping would add additional complexity that could make this tests more prone for intermittent failures and timeouts. 
    
    If you have the data written somewhere, why don't you want to read them directly? One more bonus reason for doing it as it is, it makes possible to test producers without spawning any Flink job altogether in some mini IT cases (which I'm doing in tests for `Kafka011`, I test `FlinkKafkaProducer` directly).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    I didn't know that you can have disconnected graph in Flink :)
    
    It shouldn't be caused by this commit, since it is included in my other PR. Rebased and let's make sure that it passes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    I would like to revisit tests that use this method by first discussing:
    wouldn't it be more appropriate to have a validating mapper function that throws a `SuccessException` once it sees all records? And instead of having some timeout in this method, we have a timeout set on the test method.
    This test pattern is consistently used by the integration tests for the Kafka consumer (and many other places too).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    Travis seems to have a large amount of abnormal timeouts, though. I'm not sure if it is really related to this change or otherwise. Could you do a rebase on the latest master so that the recent Travis build changes are included, and we wait for another Travis run?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    Yes and as far as I know, we are doing this. Why do you ask?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    Shouldn't we actually be deleting the topics after the test finishes?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    @pnowojski only checking that we're actually not reading the same topic again in the tests.
    So, the change in this PR is just to make sure that in the case we do do that using the `getAllRecordsFromTopic` method, we don't re-read data for verification, correct?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4310: [misc] Commit read offsets in Kafka integration tests

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/4310
  
    @pnowojski alright, that makes sense.
    You don't actually need a separate Flink job because you can just add a completely non-attached graph that consumes from the topic within the same job. But I agree that it complicates the test scenario and would also have to consider bugs in the consumer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---