You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/03/26 15:19:50 UTC

[GitHub] [druid] pjain1 opened a new pull request #11036: request logs through kafka emitter

pjain1 opened a new pull request #11036:
URL: https://github.com/apache/druid/pull/11036


   Fixes https://github.com/apache/druid/issues/10851
   
   ### Description
   
   Send request logs through Kafka Emitter, if request topic is set then the event will be emitted otherwise skipped and added to `invalidLost` count.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `KafkaEmitter.java`
    * `DefaultRequestLogEvent.java`
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [x] added documentation for new or modified features or behaviors.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
pjain1 commented on pull request #11036:
URL: https://github.com/apache/druid/pull/11036#issuecomment-810012177


   Tests failing due to insufficient test coverage.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] capistrant commented on pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
capistrant commented on pull request #11036:
URL: https://github.com/apache/druid/pull/11036#issuecomment-811454774


   > @capistrant updated, hope it makes sense.
   > 
   > Although looking at the emitter code again, I found an issue - memory bound for each queue is set to the Kafka producer buffer memory config. I believe the intention is keep the buffered events size less than or equal to the producer buffer but it will actually be 3 times the producer buffer. This problem would have been there since the implementation of this emitter so I think can be addressed in separate PR.
   
   thanks! Do you think you can open up an issue with a quick write up on that suspected bug? I agree that it can be fixed in a separate PR.
   
   I did ask a small question in a review comment. I don't mean to come in and nit at this since it is already approved. Just curious to know the reasoning on a small choice.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
pjain1 commented on pull request #11036:
URL: https://github.com/apache/druid/pull/11036#issuecomment-811442929


   @capistrant updated, hope it makes sense. 
   
   Although looking at the emitter code again, I found an issue - memory bound for each queue is set to the Kafka producer buffer memory config. I believe the intention is keep the buffered events size less than or equal to the producer buffer but it will actually be 3 times the producer buffer. This problem would have been there since the implementation of this emitter so I think can be addressed in separate PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zhangyue19921010 commented on a change in pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on a change in pull request #11036:
URL: https://github.com/apache/druid/pull/11036#discussion_r602995267



##########
File path: server/src/main/java/org/apache/druid/server/log/DefaultRequestLogEvent.java
##########
@@ -56,7 +57,21 @@
   @Override
   public Map<String, Object> toMap()
   {
-    return ImmutableMap.of();
+    final Map<String, Object> map = new HashMap<>();
+    map.put("feed", getFeed());
+    map.put("timestamp", getCreatedTime());
+    map.put("service", getService());
+    map.put("host", getHost());
+    if (getQuery() != null) {
+      map.put("query", getQuery());
+    }
+    if (getSql() != null) {
+      map.put("sql", getSql());
+    }
+    map.put("sqlQueryContext", getSqlQueryContext());

Review comment:
       Here we need to distinguish between forNative and forSql.
   `forNative` : sql and sqlQueryContext are null,
   `forSql` : query is null.
   So that maybe we need to do 
   ```
       if (getSqlQueryContext() != null) {
         map.put("sqlQueryContext", getSqlQueryContext());
       }
   ```
   

##########
File path: extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java
##########
@@ -111,6 +116,9 @@ public void start()
   {
     scheduler.schedule(this::sendMetricToKafka, 10, TimeUnit.SECONDS);
     scheduler.schedule(this::sendAlertToKafka, 10, TimeUnit.SECONDS);
+    if (config.getRequestTopic() != null) {
+      scheduler.schedule(this::sendRequestToKafka, 10, TimeUnit.SECONDS);
+    }
     scheduler.scheduleWithFixedDelay(() -> {
       log.info("Message lost counter: metricLost=[%d], alertLost=[%d], invalidLost=[%d]",
                metricLost.get(), alertLost.get(), invalidLost.get());

Review comment:
       nit : maybe need to add requestLost info here.
   something like 
   `log.info("Message lost counter: metricLost=[%d], alertLost=[%d], requestLost=[%d], invalidLost=[%d]", metricLost.get(), alertLost.get(), requestLost.get(), invalidLost.get());`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
pjain1 commented on pull request #11036:
URL: https://github.com/apache/druid/pull/11036#issuecomment-811031669


   All green, will merge by eod.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
pjain1 commented on pull request #11036:
URL: https://github.com/apache/druid/pull/11036#issuecomment-810329631


   added unit test for kafka emitter


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on a change in pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
pjain1 commented on a change in pull request #11036:
URL: https://github.com/apache/druid/pull/11036#discussion_r603028268



##########
File path: server/src/main/java/org/apache/druid/server/log/DefaultRequestLogEvent.java
##########
@@ -56,7 +57,21 @@
   @Override
   public Map<String, Object> toMap()
   {
-    return ImmutableMap.of();
+    final Map<String, Object> map = new HashMap<>();
+    map.put("feed", getFeed());
+    map.put("timestamp", getCreatedTime());
+    map.put("service", getService());
+    map.put("host", getHost());
+    if (getQuery() != null) {
+      map.put("query", getQuery());
+    }
+    if (getSql() != null) {
+      map.put("sql", getSql());
+    }
+    map.put("sqlQueryContext", getSqlQueryContext());

Review comment:
       done, thanks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on a change in pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
pjain1 commented on a change in pull request #11036:
URL: https://github.com/apache/druid/pull/11036#discussion_r605273417



##########
File path: extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java
##########
@@ -166,6 +184,10 @@ public void emit(final Event event)
           if (!alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
+        } else if (event instanceof RequestLogEvent && config.getRequestTopic() != null) {

Review comment:
       This is to keep the behaviour same as before where request log events were counted as `invalid`, now they will start seeing `requestLost` as 'N/A' in the logs if request topic is not set and the count in `invalidLost`.  But it may make more sense now to count them as `requestLost` as request log event is now supported.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] capistrant commented on a change in pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
capistrant commented on a change in pull request #11036:
URL: https://github.com/apache/druid/pull/11036#discussion_r605209281



##########
File path: extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java
##########
@@ -166,6 +184,10 @@ public void emit(final Event event)
           if (!alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
+        } else if (event instanceof RequestLogEvent && config.getRequestTopic() != null) {

Review comment:
       one question, why not still use `requestLost` as the counter for lost events instead of putting them in the bucket of `invalid` events which seems to previously have been there just for cases where the object isn't a supported event type?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] capistrant commented on pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
capistrant commented on pull request #11036:
URL: https://github.com/apache/druid/pull/11036#issuecomment-811424608


   @pjain1 This is a great add, will be very useful! Can you update the description with the information on logging now that lost events are logged under `requestLost` after latest commits? That way we won't worry about someone reading this description and not looking at the code if they are troubleshooting. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 merged pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
pjain1 merged pull request #11036:
URL: https://github.com/apache/druid/pull/11036


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] capistrant commented on a change in pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
capistrant commented on a change in pull request #11036:
URL: https://github.com/apache/druid/pull/11036#discussion_r605209281



##########
File path: extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java
##########
@@ -166,6 +184,10 @@ public void emit(final Event event)
           if (!alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
+        } else if (event instanceof RequestLogEvent && config.getRequestTopic() != null) {

Review comment:
       one question, why not still use `requestLost` as the counter for lost events instead of putting them in the bucket of `invalid` events which seems to be there for cases where the object isn't a supported event type?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on a change in pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
pjain1 commented on a change in pull request #11036:
URL: https://github.com/apache/druid/pull/11036#discussion_r603028202



##########
File path: extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java
##########
@@ -111,6 +116,9 @@ public void start()
   {
     scheduler.schedule(this::sendMetricToKafka, 10, TimeUnit.SECONDS);
     scheduler.schedule(this::sendAlertToKafka, 10, TimeUnit.SECONDS);
+    if (config.getRequestTopic() != null) {
+      scheduler.schedule(this::sendRequestToKafka, 10, TimeUnit.SECONDS);
+    }
     scheduler.scheduleWithFixedDelay(() -> {
       log.info("Message lost counter: metricLost=[%d], alertLost=[%d], invalidLost=[%d]",
                metricLost.get(), alertLost.get(), invalidLost.get());

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on a change in pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
pjain1 commented on a change in pull request #11036:
URL: https://github.com/apache/druid/pull/11036#discussion_r605275525



##########
File path: extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java
##########
@@ -166,6 +184,10 @@ public void emit(final Event event)
           if (!alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
+        } else if (event instanceof RequestLogEvent && config.getRequestTopic() != null) {

Review comment:
       made that change

##########
File path: extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java
##########
@@ -166,6 +184,10 @@ public void emit(final Event event)
           if (!alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
+        } else if (event instanceof RequestLogEvent && config.getRequestTopic() != null) {

Review comment:
       made that change @capistrant 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] capistrant commented on a change in pull request #11036: request logs through kafka emitter

Posted by GitBox <gi...@apache.org>.
capistrant commented on a change in pull request #11036:
URL: https://github.com/apache/druid/pull/11036#discussion_r605209281



##########
File path: extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java
##########
@@ -166,6 +184,10 @@ public void emit(final Event event)
           if (!alertQueue.offer(objectContainer)) {
             alertLost.incrementAndGet();
           }
+        } else if (event instanceof RequestLogEvent && config.getRequestTopic() != null) {

Review comment:
       one question, why not still use `requestLost` as the counter for lost events due to a null topic name instead of putting them in the bucket of `invalid` events which seems to previously have been there just for cases where the object isn't a supported event type?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org