Posted to dev@samza.apache.org by Martin Kleppmann <mk...@linkedin.com> on 2014/04/02 18:19:27 UTC
Re: Review Request 19481: SAMZA-180: Command-line tool for manipulating checkpoints
> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > Overall, looks really good. Few minor nits, and some build.gradle cleanup.
> >
> > Also:
> >
> > 1. Could you add docs to the website on how to use this tool?
> > 2. Could you write some unit tests for CheckpointTool?
Ok, I've added some tests.
I'm not sure what the docs should look like. Right now you can't use the checkpoint-tool.sh shell script out of the box, and the way you use it in hello-samza depends on how hello-samza is built. Should the documentation describe how to use the Gradle task? Or should we wait until we have a binary release (where users can just unpack the release and run bin/checkpoint-tool.sh without having to build anything)?
> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > build.gradle, line 164
> > <https://reviews.apache.org/r/19481/diff/3/?file=531252#file531252line164>
> >
> > No need to make samza-shell depend on these for runtime. These instructions sound better:
> >
> > http://forums.gradle.org/gradle/topics/how_to_use_in_gradle_javaexec_with_classpath_dependency
> >
> > In short:
> >
> > configurations {
> >   gradleShell
> > }
> >
> > dependencies {
> >   gradleShell project(":samza-core_$scalaVersion")
> >   gradleShell project(":samza-kafka_$scalaVersion")
> >   gradleShell project(":samza-yarn_$scalaVersion")
> >   gradleShell "org.slf4j:slf4j-simple:$slf4jVersion"
> > }
> >
> > ....
> >
> > classpath = configurations.gradleShell
> >
> > Will need to add slf4jVersion to gradle/dependency-versions.gradle. Also should update the slf4j testRuntime configuration in samza-kafka to use slf4jVersion.
Ok, that sounds like a good solution. Made that change.
> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/resources/samza-cmdline-log4j.properties, line 1
> > <https://reviews.apache.org/r/19481/diff/3/?file=531253#file531253line1>
> >
> > This file isn't needed if you use slf4j-simple as described above. I have verified by removing the file and running with:
> >
> > $ ./gradlew checkpointTool
> > The TaskContainer.add() method has been deprecated and is scheduled to be removed in Gradle 2.0. Please use the create() method instead.
> > :samza-api:compileJava UP-TO-DATE
> > :samza-api:processResources UP-TO-DATE
> > :samza-api:classes UP-TO-DATE
> > :samza-api:jar UP-TO-DATE
> > :samza-core_2.10:compileJava UP-TO-DATE
> > :samza-core_2.10:compileScala UP-TO-DATE
> > :samza-core_2.10:processResources UP-TO-DATE
> > :samza-core_2.10:classes UP-TO-DATE
> > :samza-core_2.10:jar UP-TO-DATE
> > :samza-serializers_2.10:compileJava UP-TO-DATE
> > :samza-serializers_2.10:compileScala UP-TO-DATE
> > :samza-serializers_2.10:processResources UP-TO-DATE
> > :samza-serializers_2.10:classes UP-TO-DATE
> > :samza-serializers_2.10:jar UP-TO-DATE
> > :samza-kafka_2.10:compileJava UP-TO-DATE
> > :samza-kafka_2.10:compileScala UP-TO-DATE
> > :samza-kafka_2.10:processResources UP-TO-DATE
> > :samza-kafka_2.10:classes UP-TO-DATE
> > :samza-kafka_2.10:jar UP-TO-DATE
> > :samza-yarn_2.10:compileJava UP-TO-DATE
> > :samza-yarn_2.10:compileScala UP-TO-DATE
> > :samza-yarn_2.10:processResources UP-TO-DATE
> > :samza-yarn_2.10:classes UP-TO-DATE
> > :samza-yarn_2.10:jar UP-TO-DATE
> > :samza-shell:checkpointTool
> > 0 [main] INFO org.apache.samza.checkpoint.CheckpointTool$CommandLine - HAIII
> >
> > The INFO line is just a line I added to verify, since no logging is used in CheckpointTool right now.
The gradle task works fine with slf4j-simple, but if you used CheckpointTool via the checkpoint-tool.sh shell script, you wouldn't get any logs, since the slf4j-simple dependency doesn't carry over to the shell script. (Perhaps the logs would be written to a file, depending on how slf4j is configured.)
However, that's really a separate issue from making a checkpointing tool. So I've opened https://issues.apache.org/jira/browse/SAMZA-215 to discuss the logging, and I'm removing this logging config file from this RB.
> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line 47
> > <https://reviews.apache.org/r/19481/diff/3/?file=531254#file531254line47>
> >
> > remove spaces between = here, just to avoid confusion.
Ok.
> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line 62
> > <https://reviews.apache.org/r/19481/diff/3/?file=531254#file531254line62>
> >
> > Nitpick: can we call this CheckpointToolCommandLine or something, just to avoid having two classes with the same name in different packages?
Ok.
> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line 106
> > <https://reviews.apache.org/r/19481/diff/3/?file=531254#file531254line106>
> >
> > Should do extends Logging.
> >
> > I prefer logging over println here because we will be using this guy programmatically as well as via the CLI (e.g. to embed in a Web UI).
Ok.
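To illustrate the point about logging versus println, here is a minimal sketch in Java. It uses java.util.logging as a stand-in for Samza's Logging trait, which is not shown in this thread. The class name, logger name, and reportOffset method are all hypothetical. The idea is only that an embedding application (such as a Web UI) can attach its own handler and capture the tool's output, which is not possible with println.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LoggingOverPrintln {
    // Hypothetical logger name; not taken from the Samza code base.
    private static final Logger LOG = Logger.getLogger("sketch.CheckpointTool");

    // Hypothetical tool method: reports through the logger, not System.out.
    static void reportOffset(String stream, long offset) {
        LOG.info("Checkpoint for " + stream + ": offset " + offset);
    }

    // A handler that collects messages, as an embedding application might.
    static class CollectingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            messages.add(record.getMessage());
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    public static void main(String[] args) {
        CollectingHandler handler = new CollectingHandler();
        LOG.setUseParentHandlers(false); // keep output off the console
        LOG.addHandler(handler);
        reportOffset("example-stream", 42L);
        System.out.println(handler.messages);
    }
}
```

When run from the command line, the same logger would simply be left with a console handler, so the CLI behaviour does not have to change.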
> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line 119
> > <https://reviews.apache.org/r/19481/diff/3/?file=531254#file531254line119>
> >
> > Should call manager.register for each partition before calling start. This is the agreement that CheckpointManager defines.
Ok.
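The ordering requested above (register every partition, then start) can be sketched as follows. This is only an illustration: SketchCheckpointManager is a hypothetical stand-in, not Samza's CheckpointManager, and partitions are represented as plain integers for simplicity. The recording implementation throws if register is called after start, which is one way such a contract could be enforced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a checkpoint manager; not Samza's actual interface.
interface SketchCheckpointManager {
    void register(int partition);
    void start();
    void stop();
}

// Records calls and rejects register() once the manager has been started.
class RecordingManager implements SketchCheckpointManager {
    final List<String> calls = new ArrayList<>();
    private boolean started = false;

    public void register(int partition) {
        if (started) {
            throw new IllegalStateException("register() called after start()");
        }
        calls.add("register(" + partition + ")");
    }

    public void start() {
        started = true;
        calls.add("start");
    }

    public void stop() {
        started = false;
        calls.add("stop");
    }
}

public class RegisterBeforeStart {
    // Registers every partition first, then starts the manager.
    static void run(SketchCheckpointManager manager, List<Integer> partitions) {
        for (int p : partitions) {
            manager.register(p);
        }
        manager.start();
        try {
            // ... read or write checkpoints here ...
        } finally {
            manager.stop();
        }
    }

    public static void main(String[] args) {
        RecordingManager manager = new RecordingManager();
        run(manager, Arrays.asList(0, 1, 2));
        System.out.println(manager.calls);
        // prints [register(0), register(1), register(2), start, stop]
    }
}
```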
- Martin
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19481/#review38937
-----------------------------------------------------------
On March 21, 2014, 3:14 p.m., Martin Kleppmann wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19481/
> -----------------------------------------------------------
>
> (Updated March 21, 2014, 3:14 p.m.)
>
>
> Review request for samza.
>
>
> Repository: samza
>
>
> Description
> -------
>
> Make CheckpointTool non-Kafka-specific; move CommandLine to util; move Gradle task to kafka-shell
>
>
> SAMZA-180: Command-line tool for manipulating checkpoints
>
>
> Diffs
> -----
>
> build.gradle 8e369b83b7c4a658e1a3660efc92a24efadc9fc1
> samza-core/src/main/resources/samza-cmdline-log4j.properties PRE-CREATION
> samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala PRE-CREATION
> samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala f3a75afa96a8dc64c98f37fdb88c63075ac2374b
> samza-core/src/main/scala/org/apache/samza/util/CommandLine.scala PRE-CREATION
> samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala 27b38b25dc6c34f3ef76d400370d1c857834e6a2
> samza-shell/src/main/bash/checkpoint-tool.sh PRE-CREATION
>
> Diff: https://reviews.apache.org/r/19481/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Martin Kleppmann
>
>