Posted to dev@kafka.apache.org by ewencp <gi...@git.apache.org> on 2015/07/27 08:54:53 UTC

[GitHub] kafka pull request: KAFKA-2366 [WIP]; Copycat

GitHub user ewencp opened a pull request:

    https://github.com/apache/kafka/pull/99

    KAFKA-2366 [WIP]; Copycat

    This is an initial patch implementing the basics of Copycat for KIP-26.
    
    The intent here is to start a review of the key pieces of the core API and get a reasonably functional, baseline, non-distributed implementation of Copycat in place to get things rolling. The current patch has a number of known issues that need to be addressed before a final version:
    
    * Some build-related issues. Specifically, the build requires some locally installed dependencies (see below), skips checkstyle for the runtime data library because that code is currently lifted from Avro and likely won't last in its current form, and has some Gradle task dependencies that aren't quite right because I haven't yet removed the dependency on `core` (which should now be an easy patch since new consumer groups are in a much better state).
    * This patch currently depends on some Confluent trunk code because I prototyped with our Avro serializers with schema-registry support. We need to figure out what we want to provide as an example built-in set of serializers. Unlike core Kafka, where we could sidestep the issue by providing only ByteArray or String serializers, serialization is pretty central to how Copycat works.
    * This patch uses a hacked-up version of Avro as its runtime data format. I'm not sure we want to go through the entire API discussion just to get some basic code committed, so I filed KAFKA-2367 to handle that separately. The core connector APIs and the runtime data APIs are entirely orthogonal.
    * This patch needs some updates to align with recent changes to the new consumer (specifically, I'm aware of the ConcurrentModificationException issue on exit). More generally, the new consumer is in flux but Copycat depends on it, so there are likely to be some negative interactions.
    * The layout feels a bit awkward to me right now because I ported it from a Maven layout. We don't have nearly the same level of granularity in Kafka currently (core and clients, plus the mostly ignored examples, log4j-appender, and a couple of contribs). We might want to reorganize, although keeping data+api separate from runtime and connector plugins is useful for minimizing dependencies.
    * There are a variety of other things (e.g., I'm not happy with the exception hierarchy or how exceptions are currently handled, TopicPartition doesn't really need to be duplicated unless we want Copycat entirely isolated from the Kafka APIs, etc.), but I expect we'll cover those in the review.
    
    Before commenting on the patch, it's probably worth reviewing https://issues.apache.org/jira/browse/KAFKA-2365 and https://issues.apache.org/jira/browse/KAFKA-2366 to get an idea of what I had in mind for a) what we ultimately want with all the Copycat patches and b) what we aim to cover in this initial patch. My hope is that we can use a WIP patch (after the current obvious deficiencies are addressed) while recognizing that we want to make iterative progress with a bunch of subsequent PRs.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ewencp/kafka copycat

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/99.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #99
    
----
commit 11981d2eaa2f61e81251104d6051acf6fd3911b3
Author: Ewen Cheslack-Postava <me...@ewencp.org>
Date:   2015-07-24T20:20:15Z

    Add copycat-data and copycat-api

commit 0233456c297c79c8f351dc7683a12b491d5682e8
Author: Ewen Cheslack-Postava <me...@ewencp.org>
Date:   2015-07-24T21:59:54Z

    Add copycat-avro and copycat-runtime

commit e14942cb20952263c26540fc333b7e3dc624c09c
Author: Ewen Cheslack-Postava <me...@ewencp.org>
Date:   2015-07-25T02:52:47Z

    Add Copycat file connector.

commit 31cd1caf3c48417bcfb56b8c85dfd2419712953c
Author: Ewen Cheslack-Postava <me...@ewencp.org>
Date:   2015-07-26T20:48:00Z

    Add CLI tools for Copycat.

commit 4a9b4f3c671bbba3b5d05a2ac6fed65b018649ee
Author: Ewen Cheslack-Postava <me...@ewencp.org>
Date:   2015-07-26T21:03:52Z

    Add some helpful Copycat-specific build and test targets that cover all Copycat packages.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] kafka pull request: KAFKA-2366: Copycat

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/99

