Posted to dev@samza.apache.org by Martin Kleppmann <mk...@linkedin.com> on 2014/04/02 18:19:27 UTC

Re: Review Request 19481: SAMZA-180: Command-line tool for manipulating checkpoints


> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > Overall, looks really good. Few minor nits, and some build.gradle cleanup.
> > 
> > Also:
> > 
> > 1. Could you add docs to the website on how to use this tool?
> > 2. Could you write some unit tests for CheckpointTool?

Ok, I've added some tests.

Not sure what the docs should look like. Right now you can't use the checkpoint-tool.sh shell script out of the box, and the way you use it in hello-samza depends on how hello-samza is built. Should the documentation describe how to use the Gradle task? Or should we wait until we have a binary release (where users can just unpack the release and run bin/checkpoint-tool.sh without having to build anything)?


> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > build.gradle, line 164
> > <https://reviews.apache.org/r/19481/diff/3/?file=531252#file531252line164>
> >
> >     No need to make samza-shell depend on these for runtime. These instructions sound better:
> >     
> >     http://forums.gradle.org/gradle/topics/how_to_use_in_gradle_javaexec_with_classpath_dependency
> >     
> >     In short:
> >     
> >       configurations {
> >         gradleShell
> >       }
> >     
> >       dependencies {
> >         gradleShell project(":samza-core_$scalaVersion")
> >         gradleShell project(":samza-kafka_$scalaVersion")
> >         gradleShell project(":samza-yarn_$scalaVersion")
> >         gradleShell "org.slf4j:slf4j-simple:$slf4jVersion"
> >       }
> >     
> >       ....
> >     
> >       classpath = configurations.gradleShell
> >     
> >     Will need to add slf4jVersion to gradle/dependency-versions.gradle. Also should update the slf4j testRuntime configuration in samza-kafka to use slf4jVersion.

Ok, that sounds like a good solution. Made that change.


> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/resources/samza-cmdline-log4j.properties, line 1
> > <https://reviews.apache.org/r/19481/diff/3/?file=531253#file531253line1>
> >
> >     This file isn't needed if you use slf4j-simple as described above. I have verified by removing the file and running with:
> >     
> >     $ ./gradlew checkpointTool
> >     The TaskContainer.add() method has been deprecated and is scheduled to be removed in Gradle 2.0. Please use the create() method instead.
> >     :samza-api:compileJava UP-TO-DATE
> >     :samza-api:processResources UP-TO-DATE
> >     :samza-api:classes UP-TO-DATE
> >     :samza-api:jar UP-TO-DATE
> >     :samza-core_2.10:compileJava UP-TO-DATE
> >     :samza-core_2.10:compileScala UP-TO-DATE
> >     :samza-core_2.10:processResources UP-TO-DATE
> >     :samza-core_2.10:classes UP-TO-DATE
> >     :samza-core_2.10:jar UP-TO-DATE
> >     :samza-serializers_2.10:compileJava UP-TO-DATE
> >     :samza-serializers_2.10:compileScala UP-TO-DATE
> >     :samza-serializers_2.10:processResources UP-TO-DATE
> >     :samza-serializers_2.10:classes UP-TO-DATE
> >     :samza-serializers_2.10:jar UP-TO-DATE
> >     :samza-kafka_2.10:compileJava UP-TO-DATE
> >     :samza-kafka_2.10:compileScala UP-TO-DATE
> >     :samza-kafka_2.10:processResources UP-TO-DATE
> >     :samza-kafka_2.10:classes UP-TO-DATE
> >     :samza-kafka_2.10:jar UP-TO-DATE
> >     :samza-yarn_2.10:compileJava UP-TO-DATE
> >     :samza-yarn_2.10:compileScala UP-TO-DATE
> >     :samza-yarn_2.10:processResources UP-TO-DATE
> >     :samza-yarn_2.10:classes UP-TO-DATE
> >     :samza-yarn_2.10:jar UP-TO-DATE
> >     :samza-shell:checkpointTool
> >     0 [main] INFO org.apache.samza.checkpoint.CheckpointTool$CommandLine - HAIII
> >     
> >     The INFO line is just a line I added to verify, since no logging is used in CheckpointTool right now.

The gradle task works fine with slf4j-simple, but if you used CheckpointTool via the checkpoint-tool.sh shell script, you wouldn't get any logs, since the slf4j-simple dependency doesn't carry over to the shell script. (Perhaps the logs would be written to a file, depending on how slf4j is configured.)

However, that's really a separate issue from making a checkpointing tool. So I've opened https://issues.apache.org/jira/browse/SAMZA-215 to discuss the logging, and I'm removing this logging config file from this RB.


> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line 47
> > <https://reviews.apache.org/r/19481/diff/3/?file=531254#file531254line47>
> >
> >     remove spaces between = here, just to avoid confusion.

Ok.


> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line 62
> > <https://reviews.apache.org/r/19481/diff/3/?file=531254#file531254line62>
> >
> >     nit pick: can we call this CheckpointToolCommandLine or something, just to avoid two classes with the same name, but different package spaces?

Ok.


> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line 106
> > <https://reviews.apache.org/r/19481/diff/3/?file=531254#file531254line106>
> >
> >     Should do extends Logging.
> >     
> >     I prefer logging over println here because we will be using this guy programmatically as well as via the CLI (e.g. to embed in a Web UI).

Ok.
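
To illustrate why a logging mixin suits programmatic use better than println, here is a minimal sketch. SketchLogging, SketchTool and CapturingTool are hypothetical names invented for this example; this is not Samza's Logging trait, only an assumption about the general shape of such a mixin.

```scala
// Hypothetical stand-in for a logging mixin; Samza's own Logging trait
// is not reproduced here.
trait SketchLogging {
  // Default sink writes to stderr; an embedder may override it.
  protected def sink: String => Unit = line => Console.err.println(line)

  // The message is passed by name, so it is only built when actually logged.
  def info(msg: => String): Unit =
    sink(s"INFO ${getClass.getSimpleName} - $msg")
}

// A tool that logs instead of printing directly.
class SketchTool extends SketchLogging {
  def run(): Unit = info("reading last checkpoint")
}

// An embedding caller (e.g. a web UI) captures the output without
// changing the tool's logic.
class CapturingTool extends SketchTool {
  val lines = scala.collection.mutable.ArrayBuffer.empty[String]
  override protected def sink: String => Unit = line => { lines += line; () }
}
```

With println the output would always go to stdout; with a mixin like this the caller decides where the lines end up.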


> On March 28, 2014, 9:44 p.m., Chris Riccomini wrote:
> > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line 119
> > <https://reviews.apache.org/r/19481/diff/3/?file=531254#file531254line119>
> >
> >     Should call manager.register for each partition before calling start. This is the agreement that CheckpointManager defines.

Ok.
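
For reference, the ordering described above (register every partition, then start) could be sketched as follows. SketchCheckpointManager and InMemoryCheckpointManager are hypothetical, simplified stand-ins written for this example; the real CheckpointManager interface has more methods and uses its own partition type rather than Int.

```scala
// Hypothetical, simplified stand-in for a checkpoint manager.
trait SketchCheckpointManager {
  def register(partition: Int): Unit
  def start(): Unit
  def stop(): Unit
}

// Toy implementation that enforces the contract: register must precede start.
class InMemoryCheckpointManager extends SketchCheckpointManager {
  private var started = false
  private var registered = Set.empty[Int]

  def register(partition: Int): Unit = {
    require(!started, s"partition $partition registered after start")
    registered += partition
  }

  def start(): Unit = { started = true }
  def stop(): Unit = { started = false }
  def registeredPartitions: Set[Int] = registered
}

object RegisterBeforeStart {
  // Register all partitions first, and only then start the manager.
  def open(manager: SketchCheckpointManager, partitions: Seq[Int]): Unit = {
    partitions.foreach(p => manager.register(p))
    manager.start()
  }

  def main(args: Array[String]): Unit = {
    val manager = new InMemoryCheckpointManager
    open(manager, Seq(0, 1, 2))
    println(manager.registeredPartitions.toSeq.sorted.mkString(","))
    manager.stop()
  }
}
```

In this sketch, calling register after start fails fast with an IllegalArgumentException, which makes an ordering mistake visible immediately.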


- Martin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19481/#review38937
-----------------------------------------------------------


On March 21, 2014, 3:14 p.m., Martin Kleppmann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19481/
> -----------------------------------------------------------
> 
> (Updated March 21, 2014, 3:14 p.m.)
> 
> 
> Review request for samza.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> Make CheckpointTool non-Kafka-specific; move CommandLine to util; move Gradle task to kafka-shell
> 
> 
> SAMZA-180: Command-line tool for manipulating checkpoints
> 
> 
> Diffs
> -----
> 
>   build.gradle 8e369b83b7c4a658e1a3660efc92a24efadc9fc1 
>   samza-core/src/main/resources/samza-cmdline-log4j.properties PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala f3a75afa96a8dc64c98f37fdb88c63075ac2374b 
>   samza-core/src/main/scala/org/apache/samza/util/CommandLine.scala PRE-CREATION 
>   samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala 27b38b25dc6c34f3ef76d400370d1c857834e6a2 
>   samza-shell/src/main/bash/checkpoint-tool.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/19481/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Martin Kleppmann
> 
>