You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "harinirajendran (via GitHub)" <gi...@apache.org> on 2023/05/15 17:54:51 UTC

[GitHub] [druid] harinirajendran opened a new pull request, #14281: OBSDATA-440 Adding SegmentMetadataEvent and publishing them via KafkaSegmentMetadataEmitter

harinirajendran opened a new pull request, #14281:
URL: https://github.com/apache/druid/pull/14281

   ## Details
   Adding the new SegmentMetadataEvent and publishing these segment-related metadata events into Kafka by enhancing the KafkaEmitter
   
   ## Description
   In this PR, we are enhancing KafkaEmitter, to emit metadata about published segments (SegmentMetadataEvent) into a Kafka topic. This segment metadata information that gets published into Kafka, can be used by any other downstream services to query Druid intelligently based on the segments published. There are two formats in which the segment metadata can be published to the kafka topic. By default, it's in json format. If the downstream applications want to consume this event in a backward compatibile format, then we can configure kafkaEmitter to publish this event in protobuf format.
   
   ## Old behavior of Kafka Emitter
   Kafka Emitter always emits metrics and alerts and would emit requests if the config request.topic is configured.
   Configs metric.topic and alert.topic are always mandatory and cannot be null.
   
   ## Current behavior of Kafka Emitter [with backwards compatibility]
   We introduced a new config named event.types which dictates the types of events we want the KafkaEmitter to emit. This config takes in a list of strings and can have one or more from [alerts, metrics, requests and segmentMetadata]. And based on this config, alert.topic, request.topic, metric.topic and segmentMetadata.topic should be configured and not left empty.
   If no event.types is set, then by default, the kafka emitter would emit metrics and alerts. And in that case, to maintain backwards compatibility, decision to send out requests are based on if request.topic is empty or set.
   
   ## Segment Metadata format
   The emitter by default would emit the metadata in json string format. If segment.metadata.format is set to protobuf, then the emitter emits it in protobuf format.
   
   This PR has:
   
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1200929136


##########
.idea/misc.xml:
##########
@@ -40,6 +40,11 @@
         <option value="$PROJECT_DIR$/pom.xml" />
       </list>
     </option>
+    <option name="ignoredFiles">
+      <set>
+        <option value="$PROJECT_DIR$/integration-tests/pom.xml" />

Review Comment:
   Removing them.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] github-code-scanning[bot] commented on a diff in pull request #14281: OBSDATA-440 Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "github-code-scanning[bot] (via GitHub)" <gi...@apache.org>.
github-code-scanning[bot] commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1194473065


##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -173,38 +195,87 @@
         EventMap map = event.toMap();
         if (config.getClusterName() != null) {
           map = map.asBuilder()
-                   .put("clusterName", config.getClusterName())
-                   .build();
+              .put("clusterName", config.getClusterName())
+              .build();
         }
 
-        String resultJson = jsonMapper.writeValueAsString(map);
-
-        ObjectContainer<String> objectContainer = new ObjectContainer<>(
-            resultJson,
-            StringUtils.toUtf8(resultJson).length
+        byte[] resultBytes = jsonMapper.writeValueAsBytes(map);
+        ObjectContainer<byte[]> objectContainer = new ObjectContainer<>(
+            resultBytes,
+            resultBytes.length
         );
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {
             metricLost.incrementAndGet();
           }
         } else if (event instanceof AlertEvent) {
-          if (!alertQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.ALERTS) || !alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
         } else if (event instanceof RequestLogEvent) {
-          if (config.getRequestTopic() == null || !requestQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.REQUESTS) || !requestQueue.offer(objectContainer)) {
             requestLost.incrementAndGet();
           }
+        } else if (event instanceof SegmentMetadataEvent) {
+          if (!eventTypes.contains(EventType.SEGMENTMETADATA)) {
+            segmentMetadataLost.incrementAndGet();
+          } else {
+            switch (config.getSegmentMetadataTopicFormat()) {
+              case PROTOBUF:
+                resultBytes = convertMetadataEventToProto((SegmentMetadataEvent) event, segmentMetadataLost);
+                objectContainer = new ObjectContainer<>(
+                    resultBytes,
+                    resultBytes.length
+                );
+                break;
+              case JSON:
+                // Do Nothing. We already have the JSON object stored in objectContainer
+                break;
+              default:
+                throw new UnsupportedOperationException("segmentMetadata.topic.format has an invalid value " + config.getSegmentMetadataTopicFormat().toString());
+            }
+            if (!segmentMetadataQueue.offer(objectContainer)) {
+              segmentMetadataLost.incrementAndGet();
+            }
+          }
         } else {
           invalidLost.incrementAndGet();
         }
       }
-      catch (JsonProcessingException e) {
+      catch (Exception e) {
         invalidLost.incrementAndGet();
+        log.warn(e, "Exception while serializing event");
       }
     }
   }
 
+  private byte[] convertMetadataEventToProto(SegmentMetadataEvent event, AtomicLong segmentMetadataLost)

Review Comment:
   ## Useless parameter
   
   The parameter 'segmentMetadataLost' is never used.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/4964)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197960136


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Supported types are `alerts`, `metrics`, `requests`, and `segmentMetadata`.                                                                                                   | no        | `["metrics", "alerts"]` |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197951159


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #14281: OBSDATA-440 Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "gianm (via GitHub)" <gi...@apache.org>.
gianm commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1548772012

   > Should I explicitly filter this new event type from these emitters?
   
   IMO it would make more sense to edit those emitters to ignore unknown event types.
   
   Btw, this may be OK for your use case, but, I wanted to point out that there is a race here: it's possible for the segments to be committed and for the server to crash before it emits this event. So, some might get missed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197968918


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be left empty                                   | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic.format` | Format in which segment related metadata will be emitted. <br/>Choices: json, protobuf.<br/> If set to `protobuf`, then segment metadata is emitted in `DruidSegmentEvent.proto` format | no        | json                  |
+| `druid.emitter.kafka.producer.config`              | JSON formatted configuration which user want to set additional properties to Kafka producer.                                                                                            | no        | none                  |
+| `druid.emitter.kafka.clusterName`                  | Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment.                                                                           | no        | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.clusterName`                  | Optional value to specify the name of your Druid cluster. It can help make groups in your monitoring environment.                                                                           | no        | none                  |
   ```
   Can you expand upon the following statement: "It can help make groups in your monitoring environment?" It is not clear how setting this property can help with group creation. Alternatively, consider removing this sentence completely. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1211976568


##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +203,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {

Review Comment:
   I was following the same strategy that was followed for request events earlier
   ```
             if (config.getRequestTopic() == null || !requestQueue.offer(objectContainer)) {
               requestLost.incrementAndGet();
             }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1213210836


##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +203,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {

Review Comment:
   Got it. gotta live with the sins of the past. 



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +203,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {

Review Comment:
   Got it. gotta live with the sins of the past. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1202491677


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,26 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| Property                                           | Description                                                                                                                                                                             | Required | Default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Supported types are `alerts`, `metrics`, `requests`, and `segmentMetadata`.                                                                                                   | no        | `["metrics", "alerts"]` |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be empty.                                                     | no        | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metrics. If `event.types` contains `metrics`, this field cannot be empty.                                                     | no        | none                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekrb19 commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekrb19 (via GitHub)" <gi...@apache.org>.
abhishekrb19 commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1201229476


##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +202,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {
             metricLost.incrementAndGet();
           }
         } else if (event instanceof AlertEvent) {
-          if (!alertQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.ALERTS) || !alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
         } else if (event instanceof RequestLogEvent) {
-          if (config.getRequestTopic() == null || !requestQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.REQUESTS) || !requestQueue.offer(objectContainer)) {
             requestLost.incrementAndGet();
           }
+        } else if (event instanceof SegmentMetadataEvent) {
+          if (!eventTypes.contains(EventType.SEGMENTMETADATA) || !segmentMetadataQueue.offer(objectContainer)) {
+            segmentMetadataLost.incrementAndGet();
+          }
         } else {
           invalidLost.incrementAndGet();
         }
       }
-      catch (JsonProcessingException e) {

Review Comment:
   Do we need this wider catch block or should `JsonProcessingException` suffice?



##########
extensions-contrib/kafka-emitter/src/test/java/org/apache/druid/emitter/kafka/KafkaEmitterTest.java:
##########
@@ -47,20 +51,23 @@
 @RunWith(Parameterized.class)
 public class KafkaEmitterTest
 {
-  @Parameterized.Parameter
+  @Parameterized.Parameter(0)
+  public Set<KafkaEmitterConfig.EventType> eventsType;
+
+  @Parameterized.Parameter(1)
   public String requestTopic;
 
-  @Parameterized.Parameters(name = "{index}: requestTopic - {0}")
+  @Parameterized.Parameters(name = "{index}: eventTypes - {0}, requestTopic - {1}")
   public static Object[] data()
   {
-    return new Object[] {
-        "requests",
-        null
+    return new Object[][] {
+        {new HashSet<>(Arrays.asList(KafkaEmitterConfig.EventType.METRICS, KafkaEmitterConfig.EventType.REQUESTS, KafkaEmitterConfig.EventType.ALERTS, KafkaEmitterConfig.EventType.SEGMENTMETADATA)), "requests"},
+        {new HashSet<>(Arrays.asList(KafkaEmitterConfig.EventType.METRICS, KafkaEmitterConfig.EventType.ALERTS, KafkaEmitterConfig.EventType.SEGMENTMETADATA)), null}
     };
   }
 
   // there is 1 seconds wait in kafka emitter before it starts sending events to broker, set a timeout for 5 seconds

Review Comment:
   Do we see this test fail sometimes without this bump in timeout?
   
   ```suggestion
     // there is 1 seconds wait in kafka emitter before it starts sending events to broker, set a timeout for 10 seconds
   ```



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitterConfig.java:
##########
@@ -21,53 +21,114 @@
 
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonValue;
 import com.google.common.base.Preconditions;
 import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import org.apache.druid.java.util.common.StringUtils;
 import org.apache.kafka.clients.producer.ProducerConfig;
 
 import javax.annotation.Nullable;
+import java.util.HashSet;
 import java.util.Map;
+import java.util.Set;
 
 public class KafkaEmitterConfig
 {
+  public enum EventType
+  {
+    METRICS,
+    ALERTS,
+    REQUESTS,
+    SEGMENTMETADATA {

Review Comment:
   nit: for readability, I'd suggest `SEGMENT_METADATA`. Also, I think with that we won't need a custom enum `toString()` 



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +202,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {
             metricLost.incrementAndGet();
           }
         } else if (event instanceof AlertEvent) {
-          if (!alertQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.ALERTS) || !alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
         } else if (event instanceof RequestLogEvent) {
-          if (config.getRequestTopic() == null || !requestQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.REQUESTS) || !requestQueue.offer(objectContainer)) {
             requestLost.incrementAndGet();
           }
+        } else if (event instanceof SegmentMetadataEvent) {
+          if (!eventTypes.contains(EventType.SEGMENTMETADATA) || !segmentMetadataQueue.offer(objectContainer)) {
+            segmentMetadataLost.incrementAndGet();
+          }
         } else {
           invalidLost.incrementAndGet();
         }
       }
-      catch (JsonProcessingException e) {
+      catch (Exception e) {
         invalidLost.incrementAndGet();
+        log.warn(e, "Exception while serializing event");

Review Comment:
   Related to the above comment, this log line only applies to `JsonProcessingException`s. I wonder if we should log it as an error and make it general enough instead of "serializing event".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1202494041


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,26 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| Property                                           | Description                                                                                                                                                                             | Required | Default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Supported types are `alerts`, `metrics`, `requests`, and `segmentMetadata`.                                                                                                   | no        | `["metrics", "alerts"]` |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be empty.                                                     | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alerts. If `event.types` contains `alerts`, this field cannot empty.                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be empty.                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be empty.                                   | no        | none                  |
+| `druid.emitter.kafka.producer.config`              | JSON formatted configuration to set additional properties to Kafka producer.                                                                                            | no        | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.producer.config`              | JSON configuration to set additional properties to Kafka producer.                                                                                            | no        | none                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekrb19 commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekrb19 (via GitHub)" <gi...@apache.org>.
abhishekrb19 commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1560261281

   I see UTs fail because of the enum change `SEGMENT_METADATA`. It appears that we only do upper and lower case conversions in `KafkaEmitterConfig.EventType#fromString` and `#toString` methods, so I think a custom `toString()` on the enum is needed?
   ```
   Error:    Run 4: KafkaEmitterConfigTest.testDeserializeEventTypesWithDifferentCase:108 » ValueInstantiation Cannot construct instance of `org.apache.druid.emitter.kafka.KafkaEmitterConfig$EventType`, problem: No enum constant org.apache.druid.emitter.kafka.KafkaEmitterConfig.EventType.SEGMENTMETADATA
    at [Source: (String)""segmentMetadata""; line: 1, column: 1]
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1212024835


##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +203,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {

Review Comment:
   Reference: https://github.com/apache/druid/blob/24.0.2/extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java#L195



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +203,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {

Review Comment:
   I was just following the same strategy that was followed for request events earlier
   ```
             if (config.getRequestTopic() == null || !requestQueue.offer(objectContainer)) {
               requestLost.incrementAndGet();
             }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197947936


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be left empty                                   | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic.format` | Format in which segment related metadata will be emitted. <br/>Choices: json, protobuf.<br/> If set to `protobuf`, then segment metadata is emitted in `DruidSegmentEvent.proto` format | no        | json                  |
+| `druid.emitter.kafka.producer.config`              | JSON formatted configuration which user want to set additional properties to Kafka producer.                                                                                            | no        | none                  |
+| `druid.emitter.kafka.clusterName`                  | Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment.                                                                           | no        | none                  |
 
 ### Example
 
 ```
 druid.emitter.kafka.bootstrap.servers=hostname1:9092,hostname2:9092
-druid.emitter.kafka.metric.topic=druid-metric
+druid.emitter.kafka.event.types=["alerts", "requests", "segmentMetadata"]
 druid.emitter.kafka.alert.topic=druid-alert
+druid.emitter.kafka.request.topic=druid-request-logs
+druid.emitter.kafka.segmentMetadata.topic=druid-segment-metadata
+druid.emitter.kafka.segmentMetadata.topic.format=protobuf 
 druid.emitter.kafka.producer.config={"max.block.ms":10000}
 ```
+Whenever `druid.emitter.kafka.segmentMetadata.topic.format` field is updated, it is recommended to also update  `druid.emitter.kafka.segmentMetadata.topic` to avoid the same topic from getting polluted with different formats of segment metadata.

Review Comment:
   ```suggestion
   When you update `druid.emitter.kafka.segmentMetadata.topic.format`, it is recommended that you also update  `druid.emitter.kafka.segmentMetadata.topic` to avoid the same topic from getting polluted with different formats of segment metadata.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197951159


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | Yes       | None                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197960993


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be empty.                                                     | no        | none                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1202491344


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,26 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| Property                                           | Description                                                                                                                                                                             | Required | Default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Supported types are `alerts`, `metrics`, `requests`, and `segmentMetadata`.                                                                                                   | no        | `["metrics", "alerts"]` |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be empty.                                                     | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alerts. If `event.types` contains `alerts`, this field cannot empty.                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be empty.                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be empty.                                   | no        | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segment metadata. If `event.types` contains `segmentMetadata`, this field cannot be empty.                                   | no        | none                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1572155760

   @harinirajendran - We will have to fix it. When you run locally, are you running just one test? I will suggest running the same maven command that github action is running. I think that you will run into the same error then. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1573129380

   @harinirajendran - You should avoid `force-push` in the future as we can't see the diff from the last commit anymore. can you describe the most recent change? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197964262


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be left empty                                   | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic.format` | Format in which segment related metadata will be emitted. <br/>Choices: json, protobuf.<br/> If set to `protobuf`, then segment metadata is emitted in `DruidSegmentEvent.proto` format | no        | json                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.segmentMetadata.topic.format` | The format in which segment related metadata is emitted. <br/>Supported formats are `json` and `protobuf`.<br/> If set to `protobuf`, then segment metadata is emitted in `DruidSegmentEvent.proto` format. | no        | json                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197965829


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be left empty                                   | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic.format` | Format in which segment related metadata will be emitted. <br/>Choices: json, protobuf.<br/> If set to `protobuf`, then segment metadata is emitted in `DruidSegmentEvent.proto` format | no        | json                  |
+| `druid.emitter.kafka.producer.config`              | JSON formatted configuration which user want to set additional properties to Kafka producer.                                                                                            | no        | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.producer.config`              | JSON formatted configuration to set additional properties to Kafka producer.                                                                                            | no        | none                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197961856


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be empty.                                                       | no        | none                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1198341546


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be left empty                                   | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic.format` | Format in which segment related metadata will be emitted. <br/>Choices: json, protobuf.<br/> If set to `protobuf`, then segment metadata is emitted in `DruidSegmentEvent.proto` format | no        | json                  |
+| `druid.emitter.kafka.producer.config`              | JSON formatted configuration which user want to set additional properties to Kafka producer.                                                                                            | no        | none                  |
+| `druid.emitter.kafka.clusterName`                  | Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment.                                                                           | no        | none                  |

Review Comment:
   We can group the metrics by clustername to differentiate them from one cluster to another in monitoring environments like datadog, new relic, etc.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1551767893

   > I think for it to be "perfect" the best way to do it would be to emit in the place you emit here, but also have some other process that detects missed emits somehow and fixes them up by redoing the missed emits. This would be a lot more complex of an implementation, however. So I'd only recommend doing that if it seems worth it.
   
   I think for now we can just go with this implement and enhance it to emit missed segments metadata in the future.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekrb19 commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekrb19 (via GitHub)" <gi...@apache.org>.
abhishekrb19 commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1198427132


##########
.idea/misc.xml:
##########
@@ -40,6 +40,11 @@
         <option value="$PROJECT_DIR$/pom.xml" />
       </list>
     </option>
+    <option name="ignoredFiles">
+      <set>
+        <option value="$PROJECT_DIR$/integration-tests/pom.xml" />

Review Comment:
   @harinirajendran, it looks like these changes accidentally slipped in?



##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,25 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| Property                                           | Description                                                                                                                                                                             | Required | Default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Supported types are `alerts`, `metrics`, `requests`, and `segmentMetadata`.                                                                                                   | no        | `["metrics", "alerts"]` |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be empty.                                                     | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alerts. If `event.types` contains `alerts`, this field cannot empty.                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be empty.                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be empty.                                   | no        | none                  |
+| `druid.emitter.kafka.producer.config`              | JSON formatted configuration to set additional properties to Kafka producer.                                                                                            | no        | none                  |
+| `druid.emitter.kafka.clusterName`                  | Optional value to specify the name of your Druid cluster. It can help make groups in your monitoring environment.                                                                           | no        | none                  |
 
 ### Example
 
 ```
 druid.emitter.kafka.bootstrap.servers=hostname1:9092,hostname2:9092
-druid.emitter.kafka.metric.topic=druid-metric
+druid.emitter.kafka.event.types=["alerts", "requests", "segmentMetadata"]

Review Comment:
   We might as well add metrics to the list and keep the metric topic configuration :)



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -105,7 +111,7 @@ protected Producer<String, String> setKafkaProducer()
       Properties props = new Properties();
       props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, config.getBootstrapServers());
       props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
-      props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
+      props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());

Review Comment:
   I think this would be a breaking change for existing consumers that expect string types.



##########
extensions-contrib/kafka-emitter/src/test/java/org/apache/druid/emitter/kafka/KafkaEmitterTest.java:
##########
@@ -77,19 +84,31 @@ public void testKafkaEmitter() throws InterruptedException
         ).build("service", "host")
     );
 
-    int totalEvents = serviceMetricEvents.size() + alertEvents.size() + requestLogEvents.size();
+    final List<SegmentMetadataEvent> segmentMetadataEvents = ImmutableList.of(
+        new SegmentMetadataEvent(
+            "dummy_datasource",
+            DateTimes.of("2001-01-01T00:00:00.000Z"),

Review Comment:
   👍 



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitterConfig.java:
##########
@@ -21,53 +21,115 @@
 
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonValue;
 import com.google.common.base.Preconditions;
 import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import org.apache.druid.java.util.common.StringUtils;
 import org.apache.kafka.clients.producer.ProducerConfig;
 
 import javax.annotation.Nullable;
+import java.util.HashSet;
 import java.util.Map;
+import java.util.Set;
 
 public class KafkaEmitterConfig
 {
+  public enum EventType
+  {
+    METRICS,
+    ALERTS,
+    REQUESTS,
+    SEGMENTMETADATA {
+      @Override
+      public String toString()
+      {
+        return "segmentMetadata";
+      }
+    };
+
+    @JsonValue
+    @Override
+    public String toString()
+    {
+      return StringUtils.toLowerCase(this.name());
+    }
 
+    @JsonCreator
+    public static EventType fromString(String name)
+    {
+      return valueOf(StringUtils.toUpperCase(name));
+    }
+  }
+
+  public static final Set<EventType> DEFAULT_EVENT_TYPES = ImmutableSet.of(EventType.ALERTS, EventType.METRICS);
   @JsonProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG)
   private final String bootstrapServers;
-  @JsonProperty("metric.topic")
+  @Nullable @JsonProperty("event.types")
+  private final Set<EventType> eventTypes;
+  @Nullable @JsonProperty("metric.topic")
   private final String metricTopic;
-  @JsonProperty("alert.topic")
+  @Nullable @JsonProperty("alert.topic")
   private final String alertTopic;
   @Nullable @JsonProperty("request.topic")
   private final String requestTopic;
+  @Nullable @JsonProperty("segmentMetadata.topic")
+  private final String segmentMetadataTopic;
   @JsonProperty
   private final String clusterName;
   @JsonProperty("producer.config")
-  private Map<String, String> kafkaProducerConfig;
+  private final Map<String, String> kafkaProducerConfig;
 
   @JsonCreator
   public KafkaEmitterConfig(
       @JsonProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG) String bootstrapServers,
-      @JsonProperty("metric.topic") String metricTopic,
-      @JsonProperty("alert.topic") String alertTopic,
+      @Nullable @JsonProperty("event.types") Set<EventType> eventTypes,
+      @Nullable @JsonProperty("metric.topic") String metricTopic,
+      @Nullable @JsonProperty("alert.topic") String alertTopic,
       @Nullable @JsonProperty("request.topic") String requestTopic,
+      @Nullable @JsonProperty("segmentMetadata.topic") String segmentMetadataTopic,
       @JsonProperty("clusterName") String clusterName,
       @JsonProperty("producer.config") @Nullable Map<String, String> kafkaProducerConfig
   )
   {
     this.bootstrapServers = Preconditions.checkNotNull(bootstrapServers, "bootstrap.servers can not be null");
-    this.metricTopic = Preconditions.checkNotNull(metricTopic, "metric.topic can not be null");
-    this.alertTopic = Preconditions.checkNotNull(alertTopic, "alert.topic can not be null");
-    this.requestTopic = requestTopic;
+    this.eventTypes = validateEventTypes(eventTypes, requestTopic);
+
+    this.metricTopic = this.eventTypes.contains(EventType.METRICS) ? Preconditions.checkNotNull(metricTopic, "metric.topic can not be null") : null;
+    this.alertTopic = this.eventTypes.contains(EventType.ALERTS) ? Preconditions.checkNotNull(alertTopic, "alert.topic can not be null") : null;
+    this.requestTopic = this.eventTypes.contains(EventType.REQUESTS) ? Preconditions.checkNotNull(requestTopic, "request.topic can not be null") : null;
+    this.segmentMetadataTopic = this.eventTypes.contains(EventType.SEGMENTMETADATA) ? Preconditions.checkNotNull(segmentMetadataTopic, "segmentMetadata.topic can not be null") : null;
     this.clusterName = clusterName;
     this.kafkaProducerConfig = kafkaProducerConfig == null ? ImmutableMap.of() : kafkaProducerConfig;
   }
 
+  private Set<EventType> validateEventTypes(Set<EventType> eventTypes, String requestTopic)

Review Comment:
   Should we rename this function since it's not really validating the supplied event types? Or maybe have a default jackson getter for the event types property that bakes in the backwards compatibility logic?



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitterConfig.java:
##########
@@ -21,53 +21,115 @@
 
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonValue;
 import com.google.common.base.Preconditions;
 import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import org.apache.druid.java.util.common.StringUtils;
 import org.apache.kafka.clients.producer.ProducerConfig;
 
 import javax.annotation.Nullable;
+import java.util.HashSet;
 import java.util.Map;
+import java.util.Set;
 
 public class KafkaEmitterConfig
 {
+  public enum EventType
+  {
+    METRICS,
+    ALERTS,
+    REQUESTS,
+    SEGMENTMETADATA {
+      @Override
+      public String toString()
+      {
+        return "segmentMetadata";
+      }
+    };
+
+    @JsonValue
+    @Override
+    public String toString()
+    {
+      return StringUtils.toLowerCase(this.name());
+    }
 
+    @JsonCreator
+    public static EventType fromString(String name)
+    {
+      return valueOf(StringUtils.toUpperCase(name));
+    }
+  }
+
+  public static final Set<EventType> DEFAULT_EVENT_TYPES = ImmutableSet.of(EventType.ALERTS, EventType.METRICS);
   @JsonProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG)
   private final String bootstrapServers;
-  @JsonProperty("metric.topic")
+  @Nullable @JsonProperty("event.types")
+  private final Set<EventType> eventTypes;
+  @Nullable @JsonProperty("metric.topic")
   private final String metricTopic;
-  @JsonProperty("alert.topic")
+  @Nullable @JsonProperty("alert.topic")
   private final String alertTopic;
   @Nullable @JsonProperty("request.topic")
   private final String requestTopic;
+  @Nullable @JsonProperty("segmentMetadata.topic")
+  private final String segmentMetadataTopic;
   @JsonProperty
   private final String clusterName;
   @JsonProperty("producer.config")
-  private Map<String, String> kafkaProducerConfig;
+  private final Map<String, String> kafkaProducerConfig;
 
   @JsonCreator
   public KafkaEmitterConfig(
       @JsonProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG) String bootstrapServers,
-      @JsonProperty("metric.topic") String metricTopic,
-      @JsonProperty("alert.topic") String alertTopic,
+      @Nullable @JsonProperty("event.types") Set<EventType> eventTypes,
+      @Nullable @JsonProperty("metric.topic") String metricTopic,
+      @Nullable @JsonProperty("alert.topic") String alertTopic,
       @Nullable @JsonProperty("request.topic") String requestTopic,
+      @Nullable @JsonProperty("segmentMetadata.topic") String segmentMetadataTopic,
       @JsonProperty("clusterName") String clusterName,
       @JsonProperty("producer.config") @Nullable Map<String, String> kafkaProducerConfig
   )
   {
     this.bootstrapServers = Preconditions.checkNotNull(bootstrapServers, "bootstrap.servers can not be null");
-    this.metricTopic = Preconditions.checkNotNull(metricTopic, "metric.topic can not be null");
-    this.alertTopic = Preconditions.checkNotNull(alertTopic, "alert.topic can not be null");
-    this.requestTopic = requestTopic;
+    this.eventTypes = validateEventTypes(eventTypes, requestTopic);
+

Review Comment:
   ```suggestion
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1563485918

   > I see UTs fail because of the enum change `SEGMENT_METADATA`. It appears that we only do upper and lower case conversions in `KafkaEmitterConfig.EventType#fromString` and `#toString` methods, so I think a custom `toString()` on the enum is needed?
   > 
   > ```
   > Error:    Run 4: KafkaEmitterConfigTest.testDeserializeEventTypesWithDifferentCase:108 » ValueInstantiation Cannot construct instance of `org.apache.druid.emitter.kafka.KafkaEmitterConfig$EventType`, problem: No enum constant org.apache.druid.emitter.kafka.KafkaEmitterConfig.EventType.SEGMENTMETADATA
   >  at [Source: (String)""segmentMetadata""; line: 1, column: 1]
   > ```
   
   I updated the event name from SEGMENTMETADATA to SEGMENT_METADATA but I missed to update the unit test I added. Hence the failure. Fixed it now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1550210127

   > Btw, this may be OK for your use case, but, I wanted to point out that there is a race here: it's possible for the segments to be committed and for the server to crash before it emits this event. So, some might get missed.
   
   That's a great point @gianm. Is there a better place to emit this segment metadata event instead of this place to prevent this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1198295004


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be left empty                                   | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic.format` | Format in which segment related metadata will be emitted. <br/>Choices: json, protobuf.<br/> If set to `protobuf`, then segment metadata is emitted in `DruidSegmentEvent.proto` format | no        | json                  |

Review Comment:
   For now, we have decided to just stick with the JSON implementation like other events. I'll update the PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1564503621

   @abhishekrb19 Can you give a +1 again? Fixed a bunch of static check failures.
   <img width="1344" alt="Screen Shot 2023-05-26 at 8 51 51 AM" src="https://github.com/apache/druid/assets/9054348/1067d4c7-c85d-4346-be96-8a819c02f6f5">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] xvrl commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "xvrl (via GitHub)" <gi...@apache.org>.
xvrl commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1210864200


##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -173,8 +193,8 @@ public void emit(final Event event)
         EventMap map = event.toMap();
         if (config.getClusterName() != null) {
           map = map.asBuilder()
-                   .put("clusterName", config.getClusterName())
-                   .build();
+              .put("clusterName", config.getClusterName())
+              .build();

Review Comment:
   nit, the formatting change here seems unnecessary



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1573795070

   > @harinirajendran - You should avoid `force-push` in the future as we can't see the diff from the last commit anymore. can you describe the most recent change?
   
   Ahh okay! Wasn't aware of that. Will keep that in mind for future PRs. I did not make any code changes. Just rebased the code on top of the latest master and pushed it again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: OBSDATA-440 Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1548440052

   @nishantmonu51 :  I have added a new event type in this PR that would cause the following exception to be thrown in AmbariMetricsEmitter and DropWizardEmitter
   ```
   throw new ISE("unknown event type [%s]", event.getClass());
   ```
   Should I explicitly filter this new event type from these emitters?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "gianm (via GitHub)" <gi...@apache.org>.
gianm commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1550742340

   > That's a great point @gianm. Do you have a recommendation for a better place to emit this segment metadata event instead of this place to prevent this?
   
   I think for it to be "perfect" the best way to do it would be to emit in the place you emit here, but also have some other process that detects missed emits somehow and fixes them up by redoing the missed emits. This would be a lot more complex of an implementation, however. So I'd only recommend doing that if it seems worth it.
   
   To figure that out, I would consider the requirements here. What kind of things are likely to consume the emitted payloads? Could they tolerate either of the following conditions?
   
   - missed emits (segments that are published, but never emitted)
   - bogus emits (segments that are never published, but were emitted anyway)
   
   If one or both of these can be tolerated, the implementation becomes a lot simpler.
   
   If I understand correctly— the one you have in this PR is the "missed emit" scenario. It won't generate bogus emits, but it can potentially miss some.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1550207680

   > IMO it would make more sense to edit those emitters to ignore unknown event types.
   
   That makes sense @gianm. I'll update those emitters to ignore the new event type.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197947936


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be left empty                                   | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic.format` | Format in which segment related metadata will be emitted. <br/>Choices: json, protobuf.<br/> If set to `protobuf`, then segment metadata is emitted in `DruidSegmentEvent.proto` format | no        | json                  |
+| `druid.emitter.kafka.producer.config`              | JSON formatted configuration which user want to set additional properties to Kafka producer.                                                                                            | no        | none                  |
+| `druid.emitter.kafka.clusterName`                  | Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment.                                                                           | no        | none                  |
 
 ### Example
 
 ```
 druid.emitter.kafka.bootstrap.servers=hostname1:9092,hostname2:9092
-druid.emitter.kafka.metric.topic=druid-metric
+druid.emitter.kafka.event.types=["alerts", "requests", "segmentMetadata"]
 druid.emitter.kafka.alert.topic=druid-alert
+druid.emitter.kafka.request.topic=druid-request-logs
+druid.emitter.kafka.segmentMetadata.topic=druid-segment-metadata
+druid.emitter.kafka.segmentMetadata.topic.format=protobuf 
 druid.emitter.kafka.producer.config={"max.block.ms":10000}
 ```
+Whenever `druid.emitter.kafka.segmentMetadata.topic.format` field is updated, it is recommended to also update  `druid.emitter.kafka.segmentMetadata.topic` to avoid the same topic from getting polluted with different formats of segment metadata.

Review Comment:
   ```suggestion
   When you update `druid.emitter.kafka.segmentMetadata.topic.format`, it is recommended that you also update `druid.emitter.kafka.segmentMetadata.topic` to avoid the same topic from getting polluted with different formats of segment metadata.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1563575078

   @abhishekrb19 can u approve the PR again so that the build would get triggered?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekrb19 commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekrb19 (via GitHub)" <gi...@apache.org>.
abhishekrb19 commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1563601370

   @harinirajendran, CI gets triggered by committers - I've posted this in the dev slack channel, so someone should hopefully get to it soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] xvrl commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "xvrl (via GitHub)" <gi...@apache.org>.
xvrl commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1210863423


##########
extensions-contrib/kafka-emitter/pom.xml:
##########
@@ -86,7 +86,6 @@
       <artifactId>slf4j-api</artifactId>
       <scope>provided</scope>
     </dependency>
-

Review Comment:
   nit, can we avoid unnecessary formatting changes here since we're not making change to this file?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1201042926


##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitterConfig.java:
##########
@@ -21,53 +21,115 @@
 
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonValue;
 import com.google.common.base.Preconditions;
 import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import org.apache.druid.java.util.common.StringUtils;
 import org.apache.kafka.clients.producer.ProducerConfig;
 
 import javax.annotation.Nullable;
+import java.util.HashSet;
 import java.util.Map;
+import java.util.Set;
 
 public class KafkaEmitterConfig
 {
+  public enum EventType
+  {
+    METRICS,
+    ALERTS,
+    REQUESTS,
+    SEGMENTMETADATA {
+      @Override
+      public String toString()
+      {
+        return "segmentMetadata";
+      }
+    };
+
+    @JsonValue
+    @Override
+    public String toString()
+    {
+      return StringUtils.toLowerCase(this.name());
+    }
 
+    @JsonCreator
+    public static EventType fromString(String name)
+    {
+      return valueOf(StringUtils.toUpperCase(name));
+    }
+  }
+
+  public static final Set<EventType> DEFAULT_EVENT_TYPES = ImmutableSet.of(EventType.ALERTS, EventType.METRICS);
   @JsonProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG)
   private final String bootstrapServers;
-  @JsonProperty("metric.topic")
+  @Nullable @JsonProperty("event.types")
+  private final Set<EventType> eventTypes;
+  @Nullable @JsonProperty("metric.topic")
   private final String metricTopic;
-  @JsonProperty("alert.topic")
+  @Nullable @JsonProperty("alert.topic")
   private final String alertTopic;
   @Nullable @JsonProperty("request.topic")
   private final String requestTopic;
+  @Nullable @JsonProperty("segmentMetadata.topic")
+  private final String segmentMetadataTopic;
   @JsonProperty
   private final String clusterName;
   @JsonProperty("producer.config")
-  private Map<String, String> kafkaProducerConfig;
+  private final Map<String, String> kafkaProducerConfig;
 
   @JsonCreator
   public KafkaEmitterConfig(
       @JsonProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG) String bootstrapServers,
-      @JsonProperty("metric.topic") String metricTopic,
-      @JsonProperty("alert.topic") String alertTopic,
+      @Nullable @JsonProperty("event.types") Set<EventType> eventTypes,
+      @Nullable @JsonProperty("metric.topic") String metricTopic,
+      @Nullable @JsonProperty("alert.topic") String alertTopic,
       @Nullable @JsonProperty("request.topic") String requestTopic,
+      @Nullable @JsonProperty("segmentMetadata.topic") String segmentMetadataTopic,
       @JsonProperty("clusterName") String clusterName,
       @JsonProperty("producer.config") @Nullable Map<String, String> kafkaProducerConfig
   )
   {
     this.bootstrapServers = Preconditions.checkNotNull(bootstrapServers, "bootstrap.servers can not be null");
-    this.metricTopic = Preconditions.checkNotNull(metricTopic, "metric.topic can not be null");
-    this.alertTopic = Preconditions.checkNotNull(alertTopic, "alert.topic can not be null");
-    this.requestTopic = requestTopic;
+    this.eventTypes = validateEventTypes(eventTypes, requestTopic);
+
+    this.metricTopic = this.eventTypes.contains(EventType.METRICS) ? Preconditions.checkNotNull(metricTopic, "metric.topic can not be null") : null;
+    this.alertTopic = this.eventTypes.contains(EventType.ALERTS) ? Preconditions.checkNotNull(alertTopic, "alert.topic can not be null") : null;
+    this.requestTopic = this.eventTypes.contains(EventType.REQUESTS) ? Preconditions.checkNotNull(requestTopic, "request.topic can not be null") : null;
+    this.segmentMetadataTopic = this.eventTypes.contains(EventType.SEGMENTMETADATA) ? Preconditions.checkNotNull(segmentMetadataTopic, "segmentMetadata.topic can not be null") : null;
     this.clusterName = clusterName;
     this.kafkaProducerConfig = kafkaProducerConfig == null ? ImmutableMap.of() : kafkaProducerConfig;
   }
 
+  private Set<EventType> validateEventTypes(Set<EventType> eventTypes, String requestTopic)

Review Comment:
   Renamed it to `maybeUpdateEventTypes` 



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -105,7 +111,7 @@ protected Producer<String, String> setKafkaProducer()
       Properties props = new Properties();
       props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, config.getBootstrapServers());
       props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
-      props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
+      props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());

Review Comment:
   I changed it byte[] to handle the protobuf format. Nw that we are not adding protobuf support, reverted it back to string.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1202558908


##########
extensions-contrib/kafka-emitter/src/test/java/org/apache/druid/emitter/kafka/KafkaEmitterTest.java:
##########
@@ -47,20 +51,23 @@
 @RunWith(Parameterized.class)
 public class KafkaEmitterTest
 {
-  @Parameterized.Parameter
+  @Parameterized.Parameter(0)
+  public Set<KafkaEmitterConfig.EventType> eventsType;
+
+  @Parameterized.Parameter(1)
   public String requestTopic;
 
-  @Parameterized.Parameters(name = "{index}: requestTopic - {0}")
+  @Parameterized.Parameters(name = "{index}: eventTypes - {0}, requestTopic - {1}")
   public static Object[] data()
   {
-    return new Object[] {
-        "requests",
-        null
+    return new Object[][] {
+        {new HashSet<>(Arrays.asList(KafkaEmitterConfig.EventType.METRICS, KafkaEmitterConfig.EventType.REQUESTS, KafkaEmitterConfig.EventType.ALERTS, KafkaEmitterConfig.EventType.SEGMENTMETADATA)), "requests"},
+        {new HashSet<>(Arrays.asList(KafkaEmitterConfig.EventType.METRICS, KafkaEmitterConfig.EventType.ALERTS, KafkaEmitterConfig.EventType.SEGMENTMETADATA)), null}
     };
   }
 
   // there is 1 seconds wait in kafka emitter before it starts sending events to broker, set a timeout for 5 seconds

Review Comment:
   yes. atleast in my laptop it did fail a few times.



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +202,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {
             metricLost.incrementAndGet();
           }
         } else if (event instanceof AlertEvent) {
-          if (!alertQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.ALERTS) || !alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
         } else if (event instanceof RequestLogEvent) {
-          if (config.getRequestTopic() == null || !requestQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.REQUESTS) || !requestQueue.offer(objectContainer)) {
             requestLost.incrementAndGet();
           }
+        } else if (event instanceof SegmentMetadataEvent) {
+          if (!eventTypes.contains(EventType.SEGMENTMETADATA) || !segmentMetadataQueue.offer(objectContainer)) {
+            segmentMetadataLost.incrementAndGet();
+          }
         } else {
           invalidLost.incrementAndGet();
         }
       }
-      catch (JsonProcessingException e) {
+      catch (Exception e) {
         invalidLost.incrementAndGet();
+        log.warn(e, "Exception while serializing event");

Review Comment:
   now that we are not doing protobuf, I changed it back to JsonProcessingException. So, leaving this line as is.



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +202,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {
             metricLost.incrementAndGet();
           }
         } else if (event instanceof AlertEvent) {
-          if (!alertQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.ALERTS) || !alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
         } else if (event instanceof RequestLogEvent) {
-          if (config.getRequestTopic() == null || !requestQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.REQUESTS) || !requestQueue.offer(objectContainer)) {
             requestLost.incrementAndGet();
           }
+        } else if (event instanceof SegmentMetadataEvent) {
+          if (!eventTypes.contains(EventType.SEGMENTMETADATA) || !segmentMetadataQueue.offer(objectContainer)) {
+            segmentMetadataLost.incrementAndGet();
+          }
         } else {
           invalidLost.incrementAndGet();
         }
       }
-      catch (JsonProcessingException e) {

Review Comment:
   Changed it back



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1571937252

   @xvrl @abhishekagarwal87 I ran `org.apache.druid.java.util.emitter.core.HttpEmitterConfigTest.testDefaultsLegacy` test locally in my laptop and it passed. But it failed twice when we trigger it as a part of this PR. What do we do in such cases?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197962443


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be left empty                                   | no        | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be empty.                                   | no        | none                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197961574


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |

Review Comment:
   ```suggestion
   | `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alerts. If `event.types` contains `alerts`, this field cannot empty.                                                                | no        | none                  |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1211482293


##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -119,18 +126,26 @@ protected Producer<String, String> setKafkaProducer()
   @Override
   public void start()
   {
-    scheduler.schedule(this::sendMetricToKafka, sendInterval, TimeUnit.SECONDS);
-    scheduler.schedule(this::sendAlertToKafka, sendInterval, TimeUnit.SECONDS);
-    if (config.getRequestTopic() != null) {
+    Set<EventType> eventTypes = config.getEventTypes();
+    if (eventTypes.contains(EventType.METRICS)) {
+      scheduler.schedule(this::sendMetricToKafka, sendInterval, TimeUnit.SECONDS);
+    }
+    if (eventTypes.contains(EventType.ALERTS)) {
+      scheduler.schedule(this::sendAlertToKafka, sendInterval, TimeUnit.SECONDS);
+    }
+    if (eventTypes.contains(EventType.REQUESTS)) {
       scheduler.schedule(this::sendRequestToKafka, sendInterval, TimeUnit.SECONDS);
     }
+    if (eventTypes.contains(EventType.SEGMENT_METADATA)) {
+      scheduler.schedule(this::sendSegmentMetadataToKafka, sendInterval, TimeUnit.SECONDS);
+    }
     scheduler.scheduleWithFixedDelay(() -> {
-      log.info(
-          "Message lost counter: metricLost=[%d], alertLost=[%d], requestLost=[%d], invalidLost=[%d]",
+      log.info("Message lost counter: metricLost=[%d], alertLost=[%d], requestLost=[%d], invalidLost=[%d] segmentMetadataLost=[%d]",

Review Comment:
   nit. lets have invalidLost at the end. 



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java:
##########
@@ -183,24 +203,31 @@ public void emit(final Event event)
             resultJson,
             StringUtils.toUtf8(resultJson).length
         );
+
+        Set<EventType> eventTypes = config.getEventTypes();
         if (event instanceof ServiceMetricEvent) {
-          if (!metricQueue.offer(objectContainer)) {
+          if (!eventTypes.contains(EventType.METRICS) || !metricQueue.offer(objectContainer)) {

Review Comment:
   why do we want to increment metricsLost in this case? Since not reporting is intentional. 



##########
processing/src/main/java/org/apache/druid/java/util/emitter/service/SegmentMetadataEvent.java:
##########
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.java.util.emitter.service;
+
+import com.fasterxml.jackson.annotation.JsonValue;
+import org.apache.druid.java.util.emitter.core.Event;
+import org.apache.druid.java.util.emitter.core.EventMap;
+import org.joda.time.DateTime;
+
+public class SegmentMetadataEvent implements Event

Review Comment:
   please add javadocs about the class. 



##########
extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitterConfig.java:
##########
@@ -21,53 +21,108 @@
 
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonValue;
 import com.google.common.base.Preconditions;
 import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import org.apache.druid.java.util.common.StringUtils;
 import org.apache.kafka.clients.producer.ProducerConfig;
 
 import javax.annotation.Nullable;
+import java.util.HashSet;
 import java.util.Map;
+import java.util.Set;
 
 public class KafkaEmitterConfig
 {
+  public enum EventType
+  {
+    METRICS,
+    ALERTS,
+    REQUESTS,
+    SEGMENT_METADATA;
+
+    @JsonValue
+    @Override
+    public String toString()
+    {
+      return StringUtils.toLowerCase(this.name());
+    }
 
+    @JsonCreator
+    public static EventType fromString(String name)
+    {
+      return valueOf(StringUtils.toUpperCase(name));
+    }
+  }
+
+  public static final Set<EventType> DEFAULT_EVENT_TYPES = ImmutableSet.of(EventType.ALERTS, EventType.METRICS);
   @JsonProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG)
   private final String bootstrapServers;
-  @JsonProperty("metric.topic")
+  @Nullable @JsonProperty("event.types")
+  private final Set<EventType> eventTypes;
+  @Nullable @JsonProperty("metric.topic")
   private final String metricTopic;
-  @JsonProperty("alert.topic")
+  @Nullable @JsonProperty("alert.topic")
   private final String alertTopic;
   @Nullable @JsonProperty("request.topic")
   private final String requestTopic;
+  @Nullable @JsonProperty("segmentMetadata.topic")
+  private final String segmentMetadataTopic;
   @JsonProperty
   private final String clusterName;
   @JsonProperty("producer.config")
-  private Map<String, String> kafkaProducerConfig;
+  private final Map<String, String> kafkaProducerConfig;
 
   @JsonCreator
   public KafkaEmitterConfig(
       @JsonProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG) String bootstrapServers,
-      @JsonProperty("metric.topic") String metricTopic,
-      @JsonProperty("alert.topic") String alertTopic,
+      @Nullable @JsonProperty("event.types") Set<EventType> eventTypes,
+      @Nullable @JsonProperty("metric.topic") String metricTopic,
+      @Nullable @JsonProperty("alert.topic") String alertTopic,
       @Nullable @JsonProperty("request.topic") String requestTopic,
+      @Nullable @JsonProperty("segmentMetadata.topic") String segmentMetadataTopic,
       @JsonProperty("clusterName") String clusterName,
       @JsonProperty("producer.config") @Nullable Map<String, String> kafkaProducerConfig
   )
   {
     this.bootstrapServers = Preconditions.checkNotNull(bootstrapServers, "bootstrap.servers can not be null");
-    this.metricTopic = Preconditions.checkNotNull(metricTopic, "metric.topic can not be null");
-    this.alertTopic = Preconditions.checkNotNull(alertTopic, "alert.topic can not be null");
-    this.requestTopic = requestTopic;
+    this.eventTypes = maybeUpdateEventTypes(eventTypes, requestTopic);
+    this.metricTopic = this.eventTypes.contains(EventType.METRICS) ? Preconditions.checkNotNull(metricTopic, "metric.topic can not be null") : null;

Review Comment:
   unrelated to your PR but can the error message include the full config name. E.g. `druid.emitter.kafka.metric.topic can not be null` 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] xvrl commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "xvrl (via GitHub)" <gi...@apache.org>.
xvrl commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1210869846


##########
indexing-service/src/main/java/org/apache/druid/indexing/common/actions/SegmentTransactionalInsertAction.java:
##########
@@ -257,11 +260,27 @@ public SegmentPublishResult perform(Task task, TaskActionToolbox toolbox)
           segment.getShardSpec() == null ? null : segment.getShardSpec().getType()
       );
       toolbox.getEmitter().emit(metricBuilder.build("segment/added/bytes", segment.getSize()));
+      // Emit the segment related metadata using the configured emitters
+      this.emitSegmentMetadata(segment, toolbox);

Review Comment:
   can we add a comment that captures the limitation that @gianm pointed out, explaining that some segmentmetadata events may not be captured in the event of server crash?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] xvrl commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "xvrl (via GitHub)" <gi...@apache.org>.
xvrl commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1569160225

   @harinirajendran can we update the PR description to reflect the current state of the code. It still references the protobuf implementation which is no longer part of this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1570488949

   > @harinirajendran can we update the PR description to reflect the current state of the code. It still references the protobuf implementation which is no longer part of this.
   
   Updated it @xvrl . will address other comments now


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] harinirajendran commented on pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "harinirajendran (via GitHub)" <gi...@apache.org>.
harinirajendran commented on PR #14281:
URL: https://github.com/apache/druid/pull/14281#issuecomment-1572235562

   > @harinirajendran - We will have to fix it. When you run locally, are you running just one test? I will suggest running the same maven command that github action is running. I think that you will run into the same error then.
   
   Yeah I was just running that one test. let me try using the same command and see if I can reproduce it. Thanks @abhishekagarwal87 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 merged pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 merged PR #14281:
URL: https://github.com/apache/druid/pull/14281


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekrb19 commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "abhishekrb19 (via GitHub)" <gi...@apache.org>.
abhishekrb19 commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1195803064


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |
+|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers`            | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)                                                                                                                    | yes       | none                  |
+| `druid.emitter.kafka.event.types`                  | Comma-separated event types. <br/>Choices: alerts, metrics, requests, segmentMetadata                                                                                                   | no        | ["metrics", "alerts"] |
+| `druid.emitter.kafka.metric.topic`                 | Kafka topic name for emitter's target to emit service metric. If `event.types` contains `metrics`, this field cannot be left empty                                                      | no        | none                  |
+| `druid.emitter.kafka.alert.topic`                  | Kafka topic name for emitter's target to emit alert. If `event.types` contains `alerts`, this field cannot be left empty                                                                | no        | none                  |
+| `druid.emitter.kafka.request.topic`                | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be left empty                                                       | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic`        | Kafka topic name for emitter's target to emit segments related metadata. If `event.types` contains `segmentMetadata`, this field cannot be left empty                                   | no        | none                  |
+| `druid.emitter.kafka.segmentMetadata.topic.format` | Format in which segment related metadata will be emitted. <br/>Choices: json, protobuf.<br/> If set to `protobuf`, then segment metadata is emitted in `DruidSegmentEvent.proto` format | no        | json                  |

Review Comment:
   I'm curious about the need to add protobuf encoding (and thereby a dependency) here when we can get away with `json` or a byte format? The output format, including json, can always be made backwards compatible.  And if downstream consumers want to consume the topics as a proto encoded message, that should still be possible by unmarshaling json/bytes into a proto struct as needed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ektravel commented on a diff in pull request #14281: Adding SegmentMetadataEvent and publishing them via KafkaEmitter

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14281:
URL: https://github.com/apache/druid/pull/14281#discussion_r1197949442


##########
docs/development/extensions-contrib/kafka-emitter.md:
##########
@@ -36,20 +36,28 @@ to monitor the status of your Druid cluster with this extension.
 
 All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
 
-|property|description|required?|default|
-|--------|-----------|---------|-------|
-|`druid.emitter.kafka.bootstrap.servers`|Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`)|yes|none|
-|`druid.emitter.kafka.metric.topic`|Kafka topic name for emitter's target to emit service metric.|yes|none|
-|`druid.emitter.kafka.alert.topic`|Kafka topic name for emitter's target to emit alert.|yes|none|
-|`druid.emitter.kafka.request.topic`|Kafka topic name for emitter's target to emit request logs. If left empty then request logs will not be sent to the Kafka topic.|no|none|
-|`druid.emitter.kafka.producer.config`|JSON formatted configuration which user want to set additional properties to Kafka producer.|no|none|
-|`druid.emitter.kafka.clusterName`|Optional value to specify name of your druid cluster. It can help make groups in your monitoring environment. |no|none|
+| property                                           | description                                                                                                                                                                             | required? | default               |

Review Comment:
   ```suggestion
   | Property                                           | Description                                                                                                                                                                             | Required | Default               |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org