You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by mjtieman <gi...@git.apache.org> on 2015/03/03 20:35:49 UTC
[GitHub] storm pull request: Storm 697: Support for Emitting Kafka Message ...
GitHub user mjtieman opened a pull request:
https://github.com/apache/storm/pull/454
Storm 697: Support for Emitting Kafka Message Offset and Partition
It would be nice expose the offset and partition of messages consumed from Kafka to the Scheme generating the Tuples. This is useful for auditing/replaying data from arbitrary points on a Kafka topic, saving the partition and offset of each message of a discrete stream instead of persisting the entire message.
* Added new sheme to that accepts a Partition and the message offset in its deserialization method.
* Defined an overload of `KafkaUtils.generateTuples` to accept a `Partition` and offset in addition to the message byte[].
* Added a flag in `KafkaConfig` to indicate if the metadata, partition and offset, should be available during tuple generation.
* Wrote a simple String implementation of the new scheme, `StringMessageAndMetadataScheme`, following the same pattern as `StringKeyValueScheme`.
* Unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mjtieman/storm STORM-697
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/454.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #454
----
commit 5b4c28a088ffc62ebcc28e8c28a25d096aa1eb78
Author: matt.tieman <ma...@inin.com>
Date: 2015-03-03T16:46:30Z
STORM-697: Added tupleMetaData flag
commit 6e4fde20af8d285cdf4829e4c2c4aef4cd45d89d
Author: matt.tieman <ma...@inin.com>
Date: 2015-03-03T16:47:38Z
STORM-697: Overload of generateTuples to accept the Partition and offset
commit 6e768665320d08815c53f27e706ef2ae1ff5af78
Author: matt.tieman <ma...@inin.com>
Date: 2015-03-03T16:48:57Z
STORM-697: Test for MessageMetadataSchemeAsMultiScheme and generateTuples with metadata using SchemeAsMultiScheme
commit 2f119c6e2edace030afeb9ee0885010f1de7fc28
Author: matt.tieman <ma...@inin.com>
Date: 2015-03-03T16:50:04Z
STORM-697: Added scheme to include Partition and offset when generating tuple. >>>
The MessageMetadataScheme interface extends Sheme and defines a deserialization method that accepts the message byte[], Partition, and the offset.
MessageMetadataSchemeAsMultiScheme follows the same pattern as KeyValueSchemeAsMultiScheme, extending SchemeAsMultiScheme and providing a deserialization method named for the method defined by
MessageMetadataScheme.
StringMessageAndMetadataScheme provides an implementation of MessageMetadataScheme, following the same pattern as StringKeyValueScheme.
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by maverick2202 <gi...@git.apache.org>.
Github user maverick2202 commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-145079299
Can we merge this code into 0.10 release ?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-150367335
@mjtieman
Sorry for the long delay I didn't catch this before.
One minor thing, seems like we don't need tupleMetaData flag.
We can know what information such user want to take, by only checking scheme type, and this PR already does it.
When it doesn't use StringMessageAndMetadataScheme, tupleMetaData flag is no effect. I think it shows flag is not needed.
What do you think?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-145277282
Conflicts resolved.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-151358142
Updated the storm-kafka to list MessageMetadataSchemeAsMultiScheme as a MultiScheme implementation. Let me know if there is anything else I should add.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:
https://github.com/apache/storm/pull/454#discussion_r43234876
--- Diff: external/storm-kafka/src/jvm/storm/kafka/StringMessageAndMetadataScheme.java ---
@@ -0,0 +1,25 @@
+package storm.kafka;
--- End diff --
License header should be put to the first line of the file.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-150746246
@HeartSaVioR No worries. I agree that the tupleMetaData flag is superfluous. The instanceof check that is being done to guarantee type safety can be used to determine which path to take when generating the tuples.
This will require a little bit of tweaking since we will be moving the scheme instanceof check and cast out of KafkaUtils and into PartitionManager. Updating now.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by lbruand-poctu <gi...@git.apache.org>.
Github user lbruand-poctu commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-92336938
:+1:
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by msaunier-poctu <gi...@git.apache.org>.
Github user msaunier-poctu commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-90546074
:+1:
TridentKafkaEmitter should be updated to support emitting offset in Trident :
```diff
diff --git a/external/storm-kafka/src/jvm/storm/kafka/trident/TridentKafkaEmitter.java b/external/storm-kafka/src/jvm/storm/kafka/trident/TridentKafkaEmitter.java
index 94bf134..dc9bb6d 100644
--- a/external/storm-kafka/src/jvm/storm/kafka/trident/TridentKafkaEmitter.java
+++ b/external/storm-kafka/src/jvm/storm/kafka/trident/TridentKafkaEmitter.java
@@ -113,7 +113,7 @@ public class TridentKafkaEmitter {
ByteBufferMessageSet msgs = fetchMessages(consumer, partition, offset);
long endoffset = offset;
for (MessageAndOffset msg : msgs) {
- emit(collector, msg.message());
+ emit(collector, msg.message(),partition,msg.offset());
endoffset = msg.nextOffset();
}
Map newMeta = new HashMap();
@@ -160,14 +160,19 @@ public class TridentKafkaEmitter {
if (offset > nextOffset) {
throw new RuntimeException("Error when re-emitting batch. overshot the end offset");
}
- emit(collector, msg.message());
+ emit(collector, msg.message(),partition,msg.offset());
offset = msg.nextOffset();
}
}
}
- private void emit(TridentCollector collector, Message msg) {
- Iterable<List<Object>> values = KafkaUtils.generateTuples(_config, msg);
+ private void emit(TridentCollector collector, Message msg, Partition partition, long offset) {
+ Iterable<List<Object>> values;
+ if(_config.tupleMetaData) {
+ values = KafkaUtils.generateTuples(_config, msg, partition, offset);
+ }else{
+ values = KafkaUtils.generateTuples(_config, msg);
+ }
if (values != null) {
for (List<Object> value : values) {
collector.emit(value);
```
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-151139962
Will do. I will also update the CHANGELOG and contributors section of the README.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:
https://github.com/apache/storm/pull/454#discussion_r43234861
--- Diff: external/storm-kafka/src/jvm/storm/kafka/MessageMetadataScheme.java ---
@@ -0,0 +1,25 @@
+package storm.kafka;
+
+import java.util.List;
+import backtype.storm.spout.Scheme;
+
+/**
--- End diff --
License header should be put to the first line of the file.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-102558140
Again apologies, I have never contributed code to an open source project before. Clearly I did not understand the distinction between a contributor (me) and a committer. Is there a formal way to request a pull request be merged by a committer?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-101387750
Apologies, I have to register as an Apache committer before I can merge these changes.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-145112813
This branch has fallen a little behind since the changes have been made. I will resolve the conflicts this evening.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-101367295
I have never merged into storm before, reading on how to perform the merge.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-151785449
I'd like to see explanation of MessageMetadataSchemeAsMultiScheme to README.md, since end users would like to see only README.md and play with it.
And I've found that some files' license header are missing or wrongly placed.
Could you address it? Thanks!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:
https://github.com/apache/storm/pull/454#discussion_r43234867
--- Diff: external/storm-kafka/src/jvm/storm/kafka/MessageMetadataSchemeAsMultiScheme.java ---
@@ -0,0 +1,23 @@
+package storm.kafka;
--- End diff --
License header should be put to the first line of the file.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-151841370
Done.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/storm/pull/454
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by maverick2202 <gi...@git.apache.org>.
Github user maverick2202 commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-145132346
thanks.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-149210877
@HeartSaVioR Mind taking a look at these changes to determine if they should be merged?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-150749079
Done.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-150995317
+1
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by lbruand-poctu <gi...@git.apache.org>.
Github user lbruand-poctu commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-101181445
Hi!
What is in the way of the merging of these changes ?
When can we expect them to be merged ?
Cheers,
Lucas
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-102562254
@mjtieman
No, you're going the right way.
You can refer http://storm.apache.org/documentation/BYLAWS.html to how project works.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-91916164
Good call, I will update TridentKafkaEmitter as well.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-151831272
Will do.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-96870673
If there are no other comments on these changes let's get them merged :smile: .
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-151016795
@mjtieman
Could you add this feature to external/storm-kafka/README.md so that users can see it? Thanks!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-91930393
Done.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-150367444
Otherwise looks good overall.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-151848210
+1
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-151141559
You don't need to update CHANGELOG / README since it could increase probability of confliction.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] storm pull request: Storm-697: Support for Emitting Kafka Message ...
Posted by mjtieman <gi...@git.apache.org>.
Github user mjtieman commented on the pull request:
https://github.com/apache/storm/pull/454#issuecomment-148212308
Resolved conflicts in PartitionManager. Let me know if there is anything I can fix that is blocking these changes from being mergd.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---