Posted to commits@samza.apache.org by "Roger Hoover (JIRA)" <ji...@apache.org> on 2014/09/10 00:45:29 UTC
[jira] [Commented] (SAMZA-40) Refactor Samza configuration
[ https://issues.apache.org/jira/browse/SAMZA-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127704#comment-14127704 ]
Roger Hoover commented on SAMZA-40:
-----------------------------------
Here are some things that come to mind for me but I haven't really thought through:
- What about a way to specify a DAG for the job? From the developer's point of view, she mostly cares about the data flow. Maybe there could be a pluggable naming scheme for topics in between jobs so that you don't have to explicitly name them??? You'd want a nice way to specify this. YAML?? Using job-name:
wikipedia-feed
- wikipedia-parser
- wikipedia-stats
Ideally, that would be enough to wire everything together???
- Support a programmatic, code-level API for building, validating and deploying jobs? Hopefully, this would make it possible to build higher-level frameworks on top that could dynamically generate jobs. I don't know if I'd ever want to do this but if the API is there, you never know what will spring up.
- Support for validation during build and during runtime initialization to catch errors early.
- Can sensible defaults make the config less verbose?
- What about on/off switches for things like metrics and checkpointing? If you don't specify otherwise, you get the default metrics package and Kafka checkpointing.
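
To make the DAG idea above a bit more concrete, here is a rough sketch of what a code-level version might look like. Everything in it is hypothetical: JobDagSketch, edge() and topics() are made-up names, not Samza APIs, and it assumes the three wikipedia jobs form a linear chain and that one intermediate topic is derived per edge. The naming scheme is passed in as a function, which is one possible reading of "pluggable".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical sketch only: none of these names exist in Samza.
final class JobDagSketch {
    // Adjacency list: upstream job name -> downstream job names, in declaration order.
    private final Map<String, List<String>> downstream = new LinkedHashMap<>();

    // Declares that job `to` consumes the output of job `from`.
    JobDagSketch edge(String from, String to) {
        downstream.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        return this;
    }

    // Derives one intermediate topic name per edge using the supplied naming scheme,
    // so the developer never has to name the topics explicitly.
    Map<String, String> topics(BinaryOperator<String> naming) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : downstream.entrySet()) {
            for (String to : entry.getValue()) {
                String from = entry.getKey();
                result.put(from + " -> " + to, naming.apply(from, to));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed shape of the DAG: feed -> parser -> stats.
        JobDagSketch dag = new JobDagSketch()
            .edge("wikipedia-feed", "wikipedia-parser")
            .edge("wikipedia-parser", "wikipedia-stats");

        // Example naming scheme (an assumption): "<upstream>.<downstream>".
        dag.topics((from, to) -> from + "." + to)
           .forEach((edge, topic) -> System.out.println(edge + " : " + topic));
    }
}
```

Swapping the lambda passed to topics() is all it would take to change the naming convention, which is roughly the kind of pluggability I had in mind.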
> Refactor Samza configuration
> ----------------------------
>
> Key: SAMZA-40
> URL: https://issues.apache.org/jira/browse/SAMZA-40
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.6.0
> Reporter: Chris Riccomini
> Labels: project
>
> Samza's configuration system has several problems that we need to resolve.
> * Want to auto-generate documentation based off of configuration.
> * Should support global defaults for a config property. Right now, we do config.getFoo.getOrElse() everywhere.
> * Should validate config up front, rather than throwing runtime exceptions randomly throughout the code.
> * We are mixing wiring and configuration together. How do other systems handle this?
> * We have fragmented configuration (anybody can define configuration). How do other systems handle this?
> * How to handle undefined configuration? How to make this interoperable with both Java and Scala (i.e. should we support Option in Scala)?
> * Should remain immutable.
> * Should remove implicits. It's just confusing.
> * Do we want to support complex types (list, map) for values, not just String?
> We need a design proposal for this.
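
As a possible starting point for discussion, here is a minimal sketch of how a few of the points in the description (global defaults, up-front validation, immutability, generated documentation) could fit together. It is not a design proposal and not existing Samza code: ConfigSchemaSketch, define(), resolve() and describe() are hypothetical names, the property keys are only illustrative, and values are kept as plain Strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: these classes are not part of Samza.
final class ConfigSchemaSketch {
    // One declared property. A null defaultValue means the property is required.
    static final class Property {
        final String key;
        final String defaultValue;
        final String doc;

        Property(String key, String defaultValue, String doc) {
            this.key = key;
            this.defaultValue = defaultValue;
            this.doc = doc;
        }
    }

    private final Map<String, Property> properties = new LinkedHashMap<>();

    // Declares a property in one central place, with its global default and doc string.
    ConfigSchemaSketch define(String key, String defaultValue, String doc) {
        properties.put(key, new Property(key, defaultValue, doc));
        return this;
    }

    // Validates the raw config once, up front, and returns an immutable resolved view.
    // All problems are collected and reported together instead of failing later at runtime.
    Map<String, String> resolve(Map<String, String> raw) {
        List<String> errors = new ArrayList<>();
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Property p : properties.values()) {
            String value = raw.get(p.key);
            if (value != null) {
                resolved.put(p.key, value);
            } else if (p.defaultValue != null) {
                resolved.put(p.key, p.defaultValue);
            } else {
                errors.add("missing required property: " + p.key);
            }
        }
        for (String key : raw.keySet()) {
            if (!properties.containsKey(key)) {
                errors.add("unknown property: " + key);
            }
        }
        if (!errors.isEmpty()) {
            throw new IllegalArgumentException(String.join("; ", errors));
        }
        return Collections.unmodifiableMap(resolved);
    }

    // Generates one line of documentation per declared property.
    String describe() {
        StringBuilder out = new StringBuilder();
        for (Property p : properties.values()) {
            String def = p.defaultValue == null ? "required" : "default: " + p.defaultValue;
            out.append(p.key).append(" (").append(def).append("): ").append(p.doc).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative keys only.
        ConfigSchemaSketch schema = new ConfigSchemaSketch()
            .define("job.name", null, "Name of the job.")
            .define("metrics.enabled", "true", "Whether the default metrics package is on.");

        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("job.name", "wikipedia-parser");

        System.out.println(schema.resolve(raw));
        System.out.print(schema.describe());
    }
}
```

Rejecting unknown keys is a deliberate but debatable choice here: it catches typos early, yet it would conflict with the current situation where anybody can define configuration, so a real design would need to decide how components register their own properties.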
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)