You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/10/10 04:01:11 UTC

[GitHub] [druid] LakshSingla opened a new pull request, #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

LakshSingla opened a new pull request, #13198:
URL: https://github.com/apache/druid/pull/13198

   ### Description
   
   In MSQ, there can be an upper limit to the number of worker warnings. For example, for parseExceptions encountered while parsing the external data, the user can specify an upper limit to the number of parse exceptions that can be allowed before it throws an error of type `TooManyWarnings`. 
   
   This PR makes it so that if the user disallows warnings of a certain type i.e. the limit is 0 (or is executing in `strict` mode), instead of throwing an error of type `TooManyWarnings`, we can directly surface the warning as the error, saving the user from the hassle of going throw the warning reports.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `ControllerImpl`
   
   
   <hr>
   
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r992954320


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -642,12 +642,22 @@ public void updateCounters(CounterSnapshotsTree snapshotsTree)
       // Present means the warning limit was exceeded, and warnings have therefore turned into an error.
       String errorCode = warningsExceeded.get().lhs;
       Long limit = warningsExceeded.get().rhs;
-      workerError(MSQErrorReport.fromFault(
-          id(),
-          selfDruidNode.getHost(),
-          null,
-          new TooManyWarningsFault(limit.intValue(), errorCode)
-      ));
+
+      if (limit == 0L) {
+        for (MSQErrorReport errorReport1 : workerWarnings) {
+          if (errorReport1.getFault().getErrorCode().equalsIgnoreCase(errorCode)) {
+            workerError(errorReport1);
+            break;
+          }
+        }

Review Comment:
   I was thinking, let the worker impl directly hit worker error endpoint in case the limit is 0 wdyt ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r992956409


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -642,12 +642,22 @@ public void updateCounters(CounterSnapshotsTree snapshotsTree)
       // Present means the warning limit was exceeded, and warnings have therefore turned into an error.
       String errorCode = warningsExceeded.get().lhs;
       Long limit = warningsExceeded.get().rhs;
-      workerError(MSQErrorReport.fromFault(
-          id(),
-          selfDruidNode.getHost(),
-          null,
-          new TooManyWarningsFault(limit.intValue(), errorCode)
-      ));
+
+      if (limit == 0L) {
+        for (MSQErrorReport errorReport1 : workerWarnings) {
+          if (errorReport1.getFault().getErrorCode().equalsIgnoreCase(errorCode)) {
+            workerError(errorReport1);
+            break;
+          }
+        }

Review Comment:
   Trying out the changes, that seems like a much better alternative.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r991839996


##########
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/test/MSQTestBase.java:
##########
@@ -851,19 +858,28 @@ public void verifyResults()
       Preconditions.checkArgument(expectedDataSource != null, "dataSource cannot be null");
       Preconditions.checkArgument(expectedRowSignature != null, "expectedRowSignature cannot be null");
       Preconditions.checkArgument(
-          expectedResultRows != null || expectedMSQFault != null,
-          "atleast one of expectedResultRows or expectedMSQFault should be set to non null"
+          expectedResultRows != null || expectedMSQFault != null || expectedMSQFaultClass != null,
+          "atleast one of expectedResultRows, expectedMSQFault or expectedMSQFaultClass should be set to non null"
       );
       Preconditions.checkArgument(expectedShardSpec != null, "shardSpecClass cannot be null");
       readyToRun();
       try {
         String controllerId = runMultiStageQuery(sql, queryContext);
-        if (expectedMSQFault != null) {
+        if (expectedMSQFault != null || expectedMSQFaultClass != null) {
           MSQErrorReport msqErrorReport = getErrorReportOrThrow(controllerId);
-          Assert.assertEquals(
-              expectedMSQFault.getCodeWithMessage(),
-              msqErrorReport.getFault().getCodeWithMessage()
-          );
+          if (expectedMSQFault != null) {
+            Assert.assertEquals(
+                expectedMSQFault.getCodeWithMessage(),
+                msqErrorReport.getFault().getCodeWithMessage()
+            );
+          }
+          if (expectedMSQFaultClass != null) {

Review Comment:
   With the unparseable exceptions, the exception message contained nonstandard characters (from the erroneous line) which I didn't want to use in the code file. Moreover, this would prevent tests from failing if the input file is changed. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r991839996


##########
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/test/MSQTestBase.java:
##########
@@ -851,19 +858,28 @@ public void verifyResults()
       Preconditions.checkArgument(expectedDataSource != null, "dataSource cannot be null");
       Preconditions.checkArgument(expectedRowSignature != null, "expectedRowSignature cannot be null");
       Preconditions.checkArgument(
-          expectedResultRows != null || expectedMSQFault != null,
-          "atleast one of expectedResultRows or expectedMSQFault should be set to non null"
+          expectedResultRows != null || expectedMSQFault != null || expectedMSQFaultClass != null,
+          "atleast one of expectedResultRows, expectedMSQFault or expectedMSQFaultClass should be set to non null"
       );
       Preconditions.checkArgument(expectedShardSpec != null, "shardSpecClass cannot be null");
       readyToRun();
       try {
         String controllerId = runMultiStageQuery(sql, queryContext);
-        if (expectedMSQFault != null) {
+        if (expectedMSQFault != null || expectedMSQFaultClass != null) {
           MSQErrorReport msqErrorReport = getErrorReportOrThrow(controllerId);
-          Assert.assertEquals(
-              expectedMSQFault.getCodeWithMessage(),
-              msqErrorReport.getFault().getCodeWithMessage()
-          );
+          if (expectedMSQFault != null) {
+            Assert.assertEquals(
+                expectedMSQFault.getCodeWithMessage(),
+                msqErrorReport.getFault().getCodeWithMessage()
+            );
+          }
+          if (expectedMSQFaultClass != null) {

Review Comment:
   With the unparseable exceptions, the exception message contained nonstandard characters (from the erroneous line) which I didn't want to use in the code file.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r991329095


##########
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/test/MSQTestBase.java:
##########
@@ -851,19 +858,28 @@ public void verifyResults()
       Preconditions.checkArgument(expectedDataSource != null, "dataSource cannot be null");
       Preconditions.checkArgument(expectedRowSignature != null, "expectedRowSignature cannot be null");
       Preconditions.checkArgument(
-          expectedResultRows != null || expectedMSQFault != null,
-          "atleast one of expectedResultRows or expectedMSQFault should be set to non null"
+          expectedResultRows != null || expectedMSQFault != null || expectedMSQFaultClass != null,
+          "atleast one of expectedResultRows, expectedMSQFault or expectedMSQFaultClass should be set to non null"
       );
       Preconditions.checkArgument(expectedShardSpec != null, "shardSpecClass cannot be null");
       readyToRun();
       try {
         String controllerId = runMultiStageQuery(sql, queryContext);
-        if (expectedMSQFault != null) {
+        if (expectedMSQFault != null || expectedMSQFaultClass != null) {
           MSQErrorReport msqErrorReport = getErrorReportOrThrow(controllerId);
-          Assert.assertEquals(
-              expectedMSQFault.getCodeWithMessage(),
-              msqErrorReport.getFault().getCodeWithMessage()
-          );
+          if (expectedMSQFault != null) {
+            Assert.assertEquals(
+                expectedMSQFault.getCodeWithMessage(),
+                msqErrorReport.getFault().getCodeWithMessage()
+            );
+          }
+          if (expectedMSQFaultClass != null) {

Review Comment:
   Shouldn't we always assert the MSQ fault? 
   Is there any specific reason to assert on the MSQ class?



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -642,12 +642,22 @@ public void updateCounters(CounterSnapshotsTree snapshotsTree)
       // Present means the warning limit was exceeded, and warnings have therefore turned into an error.
       String errorCode = warningsExceeded.get().lhs;
       Long limit = warningsExceeded.get().rhs;
-      workerError(MSQErrorReport.fromFault(
-          id(),
-          selfDruidNode.getHost(),
-          null,
-          new TooManyWarningsFault(limit.intValue(), errorCode)
-      ));
+
+      if (limit == 0L) {
+        for (MSQErrorReport errorReport1 : workerWarnings) {

Review Comment:
   nit: I think we should rename the variable to warningReport. 



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -642,12 +642,22 @@ public void updateCounters(CounterSnapshotsTree snapshotsTree)
       // Present means the warning limit was exceeded, and warnings have therefore turned into an error.
       String errorCode = warningsExceeded.get().lhs;
       Long limit = warningsExceeded.get().rhs;
-      workerError(MSQErrorReport.fromFault(
-          id(),
-          selfDruidNode.getHost(),
-          null,
-          new TooManyWarningsFault(limit.intValue(), errorCode)
-      ));
+
+      if (limit == 0L) {
+        for (MSQErrorReport errorReport1 : workerWarnings) {
+          if (errorReport1.getFault().getErrorCode().equalsIgnoreCase(errorCode)) {
+            workerError(errorReport1);
+            break;
+          }
+        }

Review Comment:
   We should also throw an illegal state exception if we donot find the error code in the workerWarnings. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r992953017


##########
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/test/MSQTestBase.java:
##########
@@ -851,19 +858,28 @@ public void verifyResults()
       Preconditions.checkArgument(expectedDataSource != null, "dataSource cannot be null");
       Preconditions.checkArgument(expectedRowSignature != null, "expectedRowSignature cannot be null");
       Preconditions.checkArgument(
-          expectedResultRows != null || expectedMSQFault != null,
-          "atleast one of expectedResultRows or expectedMSQFault should be set to non null"
+          expectedResultRows != null || expectedMSQFault != null || expectedMSQFaultClass != null,
+          "atleast one of expectedResultRows, expectedMSQFault or expectedMSQFaultClass should be set to non null"
       );
       Preconditions.checkArgument(expectedShardSpec != null, "shardSpecClass cannot be null");
       readyToRun();
       try {
         String controllerId = runMultiStageQuery(sql, queryContext);
-        if (expectedMSQFault != null) {
+        if (expectedMSQFault != null || expectedMSQFaultClass != null) {
           MSQErrorReport msqErrorReport = getErrorReportOrThrow(controllerId);
-          Assert.assertEquals(
-              expectedMSQFault.getCodeWithMessage(),
-              msqErrorReport.getFault().getCodeWithMessage()
-          );
+          if (expectedMSQFault != null) {
+            Assert.assertEquals(
+                expectedMSQFault.getCodeWithMessage(),
+                msqErrorReport.getFault().getCodeWithMessage()
+            );
+          }
+          if (expectedMSQFaultClass != null) {

Review Comment:
   We change the get message call at line 863 to be a pattern matcher. That way you need not code out the whole string. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r994555455


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/error/MSQWarningReportLimiterPublisher.java:
##########
@@ -35,35 +36,38 @@
 public class MSQWarningReportLimiterPublisher implements MSQWarningReportPublisher
 {
 
-  final MSQWarningReportPublisher delegate;
-  final long totalLimit;
-  final Map<String, Long> errorCodeToLimit;
-  final ConcurrentHashMap<String, Long> errorCodeToCurrentCount = new ConcurrentHashMap<>();
+  private final MSQWarningReportPublisher delegate;
+  private final long totalLimit;
+  private final Map<String, Long> errorCodeToVerboseCountLimit;

Review Comment:
   I think that the errorCodeToLimit was slightly misleading. This is because there can be more than those error codes can be present in the controller. However, I do get your point. Maybe I will add clearer javadocs so that this ambiguity isn't present and revert the change.  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r994241988


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerImpl.java:
##########
@@ -260,13 +264,36 @@ public Optional<MSQErrorReport> runTask(final Closer closer) throws Exception
       }
     });
 
+    long maxAllowedParseExceptions = Long.parseLong(task.getContext().getOrDefault(
+        MSQWarnings.CTX_MAX_PARSE_EXCEPTIONS_ALLOWED,
+        Long.MAX_VALUE
+    ).toString());
+
+    long maxVerboseParseExceptions;
+    if (maxAllowedParseExceptions == -1L) {
+      maxVerboseParseExceptions = Limits.MAX_VERBOSE_PARSE_EXCEPTIONS;
+    } else {
+      maxVerboseParseExceptions = Math.min(maxAllowedParseExceptions, Limits.MAX_VERBOSE_PARSE_EXCEPTIONS);
+    }
+
+    Set<String> disallowedWarningCode = ImmutableSet.of();
+    if (maxAllowedParseExceptions == 0) {
+      disallowedWarningCode = ImmutableSet.of(CannotParseExternalDataFault.CODE);
+    }
+
     final MSQWarningReportPublisher msqWarningReportPublisher = new MSQWarningReportLimiterPublisher(
         new MSQWarningReportSimplePublisher(
             id(),
             controllerClient,
             id(),
             MSQTasks.getHostFromSelfNode(selfDruidNode)
-        )
+        ),
+        Limits.MAX_VERBOSE_WARNINGS,
+        ImmutableMap.of(CannotParseExternalDataFault.CODE, maxVerboseParseExceptions),
+        disallowedWarningCode,
+        controllerClient,
+        id(),

Review Comment:
   nit: id, host,  controllerClient should be the first arguments



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerImpl.java:
##########
@@ -260,13 +264,36 @@ public Optional<MSQErrorReport> runTask(final Closer closer) throws Exception
       }
     });
 
+    long maxAllowedParseExceptions = Long.parseLong(task.getContext().getOrDefault(
+        MSQWarnings.CTX_MAX_PARSE_EXCEPTIONS_ALLOWED,
+        Long.MAX_VALUE
+    ).toString());
+
+    long maxVerboseParseExceptions;
+    if (maxAllowedParseExceptions == -1L) {
+      maxVerboseParseExceptions = Limits.MAX_VERBOSE_PARSE_EXCEPTIONS;
+    } else {
+      maxVerboseParseExceptions = Math.min(maxAllowedParseExceptions, Limits.MAX_VERBOSE_PARSE_EXCEPTIONS);
+    }
+
+    Set<String> disallowedWarningCode = ImmutableSet.of();

Review Comment:
   Nit: 
   you can just set this to null and initialize it in the else part. 
   Also, we can rename this to critical error codes



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/error/MSQWarningReportLimiterPublisher.java:
##########
@@ -35,35 +36,38 @@
 public class MSQWarningReportLimiterPublisher implements MSQWarningReportPublisher
 {
 
-  final MSQWarningReportPublisher delegate;
-  final long totalLimit;
-  final Map<String, Long> errorCodeToLimit;
-  final ConcurrentHashMap<String, Long> errorCodeToCurrentCount = new ConcurrentHashMap<>();
+  private final MSQWarningReportPublisher delegate;
+  private final long totalLimit;
+  private final Map<String, Long> errorCodeToVerboseCountLimit;
+  private final Set<String> disallowedWarningCode;
+  private final ConcurrentHashMap<String, Long> errorCodeToCurrentCount = new ConcurrentHashMap<>();
+  private final ControllerClient controllerClient;
+  private final String workerId;
+
+  @Nullable
+  private final String host;
 
   long totalCount = 0L;
 
   final Object lock = new Object();
 
-  public MSQWarningReportLimiterPublisher(MSQWarningReportPublisher delegate)
-  {
-    this(
-        delegate,
-        Limits.MAX_VERBOSE_WARNINGS,
-        ImmutableMap.of(
-            CannotParseExternalDataFault.CODE, Limits.MAX_VERBOSE_PARSE_EXCEPTIONS
-        )
-    );
-  }
-
   public MSQWarningReportLimiterPublisher(

Review Comment:
   I think its about time to add a java doc here.



##########
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/test/MSQTestBase.java:
##########
@@ -851,19 +858,28 @@ public void verifyResults()
       Preconditions.checkArgument(expectedDataSource != null, "dataSource cannot be null");
       Preconditions.checkArgument(expectedRowSignature != null, "expectedRowSignature cannot be null");
       Preconditions.checkArgument(
-          expectedResultRows != null || expectedMSQFault != null,
-          "atleast one of expectedResultRows or expectedMSQFault should be set to non null"
+          expectedResultRows != null || expectedMSQFault != null || expectedMSQFaultClass != null,
+          "atleast one of expectedResultRows, expectedMSQFault or expectedMSQFaultClass should be set to non null"
       );
       Preconditions.checkArgument(expectedShardSpec != null, "shardSpecClass cannot be null");
       readyToRun();
       try {
         String controllerId = runMultiStageQuery(sql, queryContext);
-        if (expectedMSQFault != null) {
+        if (expectedMSQFault != null || expectedMSQFaultClass != null) {
           MSQErrorReport msqErrorReport = getErrorReportOrThrow(controllerId);
-          Assert.assertEquals(
-              expectedMSQFault.getCodeWithMessage(),
-              msqErrorReport.getFault().getCodeWithMessage()
-          );
+          if (expectedMSQFault != null) {
+            Assert.assertEquals(
+                expectedMSQFault.getCodeWithMessage(),
+                msqErrorReport.getFault().getCodeWithMessage()
+            );
+          }
+          if (expectedMSQFaultClass != null) {

Review Comment:
   This got me thinking, if we have a malformed line, we can have a parseException with x mb's  of input. 
   Multiple such lines would blow up our report.
   Should we chomp the input to lets day 300 bytes or something in ParseException#75 
   cc @jon-wei 



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/error/MSQWarningReportLimiterPublisher.java:
##########
@@ -74,12 +78,22 @@ public void publishException(int stageNumber, Throwable e)
       totalCount = totalCount + 1;
       errorCodeToCurrentCount.compute(errorCode, (ignored, count) -> count == null ? 1L : count + 1);
 
+      // Send the warning as an error if it is disallowed altogether
+      if (disallowedWarningCode.contains(errorCode)) {
+        try {
+          controllerClient.postWorkerError(workerId, MSQErrorReport.fromException(workerId, host, stageNumber, e));
+        }
+        catch (IOException e2) {

Review Comment:
   Can we throw a new RE with the message `failed to post the worker error xyz to the controller`. That way the worker will get terminated with a relevant error. 



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/error/MSQWarningReportLimiterPublisher.java:
##########
@@ -35,35 +36,38 @@
 public class MSQWarningReportLimiterPublisher implements MSQWarningReportPublisher
 {
 
-  final MSQWarningReportPublisher delegate;
-  final long totalLimit;
-  final Map<String, Long> errorCodeToLimit;
-  final ConcurrentHashMap<String, Long> errorCodeToCurrentCount = new ConcurrentHashMap<>();
+  private final MSQWarningReportPublisher delegate;
+  private final long totalLimit;
+  private final Map<String, Long> errorCodeToVerboseCountLimit;

Review Comment:
   nit: errorCodeToLimit seems like a better variable to me. 
   we really do not need verboseCountLimt as that's a concept outside this class. 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r994577370


##########
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/test/MSQTestBase.java:
##########
@@ -851,19 +858,28 @@ public void verifyResults()
       Preconditions.checkArgument(expectedDataSource != null, "dataSource cannot be null");
       Preconditions.checkArgument(expectedRowSignature != null, "expectedRowSignature cannot be null");
       Preconditions.checkArgument(
-          expectedResultRows != null || expectedMSQFault != null,
-          "atleast one of expectedResultRows or expectedMSQFault should be set to non null"
+          expectedResultRows != null || expectedMSQFault != null || expectedMSQFaultClass != null,
+          "atleast one of expectedResultRows, expectedMSQFault or expectedMSQFaultClass should be set to non null"
       );
       Preconditions.checkArgument(expectedShardSpec != null, "shardSpecClass cannot be null");
       readyToRun();
       try {
         String controllerId = runMultiStageQuery(sql, queryContext);
-        if (expectedMSQFault != null) {
+        if (expectedMSQFault != null || expectedMSQFaultClass != null) {
           MSQErrorReport msqErrorReport = getErrorReportOrThrow(controllerId);
-          Assert.assertEquals(
-              expectedMSQFault.getCodeWithMessage(),
-              msqErrorReport.getFault().getCodeWithMessage()
-          );
+          if (expectedMSQFault != null) {
+            Assert.assertEquals(
+                expectedMSQFault.getCodeWithMessage(),
+                msqErrorReport.getFault().getCodeWithMessage()
+            );
+          }
+          if (expectedMSQFaultClass != null) {

Review Comment:
   Stack traces of innocuous warnings and errors can also cause a similar issue. Should this be taken up as a separate PR? Also, 300 bytes seem less to me, something like 4KB might work better, wdyt? (I think we should be fine with an even larger limit as the number of warnings sent to the controller are limited). 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r995344334


##########
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/test/MSQTestBase.java:
##########
@@ -851,19 +858,28 @@ public void verifyResults()
       Preconditions.checkArgument(expectedDataSource != null, "dataSource cannot be null");
       Preconditions.checkArgument(expectedRowSignature != null, "expectedRowSignature cannot be null");
       Preconditions.checkArgument(
-          expectedResultRows != null || expectedMSQFault != null,
-          "atleast one of expectedResultRows or expectedMSQFault should be set to non null"
+          expectedResultRows != null || expectedMSQFault != null || expectedMSQFaultClass != null,
+          "atleast one of expectedResultRows, expectedMSQFault or expectedMSQFaultClass should be set to non null"
       );
       Preconditions.checkArgument(expectedShardSpec != null, "shardSpecClass cannot be null");
       readyToRun();
       try {
         String controllerId = runMultiStageQuery(sql, queryContext);
-        if (expectedMSQFault != null) {
+        if (expectedMSQFault != null || expectedMSQFaultClass != null) {
           MSQErrorReport msqErrorReport = getErrorReportOrThrow(controllerId);
-          Assert.assertEquals(
-              expectedMSQFault.getCodeWithMessage(),
-              msqErrorReport.getFault().getCodeWithMessage()
-          );
+          if (expectedMSQFault != null) {
+            Assert.assertEquals(
+                expectedMSQFault.getCodeWithMessage(),
+                msqErrorReport.getFault().getCodeWithMessage()
+            );
+          }
+          if (expectedMSQFaultClass != null) {

Review Comment:
   Yup We can do that outside this PR. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 merged pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 merged PR #13198:
URL: https://github.com/apache/druid/pull/13198


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] LakshSingla commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
LakshSingla commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r991842746


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java:
##########
@@ -642,12 +642,22 @@ public void updateCounters(CounterSnapshotsTree snapshotsTree)
       // Present means the warning limit was exceeded, and warnings have therefore turned into an error.
       String errorCode = warningsExceeded.get().lhs;
       Long limit = warningsExceeded.get().rhs;
-      workerError(MSQErrorReport.fromFault(
-          id(),
-          selfDruidNode.getHost(),
-          null,
-          new TooManyWarningsFault(limit.intValue(), errorCode)
-      ));
+
+      if (limit == 0L) {
+        for (MSQErrorReport errorReport1 : workerWarnings) {
+          if (errorReport1.getFault().getErrorCode().equalsIgnoreCase(errorCode)) {
+            workerError(errorReport1);
+            break;
+          }
+        }

Review Comment:
   Makes sense, or else the controller would fail to stop.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r992953222


##########
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/test/MSQTestBase.java:
##########
@@ -851,19 +858,28 @@ public void verifyResults()
       Preconditions.checkArgument(expectedDataSource != null, "dataSource cannot be null");
       Preconditions.checkArgument(expectedRowSignature != null, "expectedRowSignature cannot be null");
       Preconditions.checkArgument(
-          expectedResultRows != null || expectedMSQFault != null,
-          "atleast one of expectedResultRows or expectedMSQFault should be set to non null"
+          expectedResultRows != null || expectedMSQFault != null || expectedMSQFaultClass != null,
+          "atleast one of expectedResultRows, expectedMSQFault or expectedMSQFaultClass should be set to non null"
       );
       Preconditions.checkArgument(expectedShardSpec != null, "shardSpecClass cannot be null");
       readyToRun();
       try {
         String controllerId = runMultiStageQuery(sql, queryContext);
-        if (expectedMSQFault != null) {
+        if (expectedMSQFault != null || expectedMSQFaultClass != null) {
           MSQErrorReport msqErrorReport = getErrorReportOrThrow(controllerId);
-          Assert.assertEquals(
-              expectedMSQFault.getCodeWithMessage(),
-              msqErrorReport.getFault().getCodeWithMessage()
-          );
+          if (expectedMSQFault != null) {
+            Assert.assertEquals(
+                expectedMSQFault.getCodeWithMessage(),
+                msqErrorReport.getFault().getCodeWithMessage()
+            );
+          }
+          if (expectedMSQFaultClass != null) {

Review Comment:
   Or change the source test file. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r997258744


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/error/MSQWarningReportLimiterPublisher.java:
##########
@@ -35,35 +37,46 @@
 public class MSQWarningReportLimiterPublisher implements MSQWarningReportPublisher
 {
 
-  final MSQWarningReportPublisher delegate;
-  final long totalLimit;
-  final Map<String, Long> errorCodeToLimit;
-  final ConcurrentHashMap<String, Long> errorCodeToCurrentCount = new ConcurrentHashMap<>();
+  private final MSQWarningReportPublisher delegate;
+  private final long totalLimit;
+  private final Map<String, Long> errorCodeToLimit;
+  private final Set<String> criticalWarningCodes;
+  private final ConcurrentHashMap<String, Long> errorCodeToCurrentCount = new ConcurrentHashMap<>();
+  private final ControllerClient controllerClient;
+  private final String workerId;
+
+  @Nullable
+  private final String host;
 
   long totalCount = 0L;
 
   final Object lock = new Object();
 
-  public MSQWarningReportLimiterPublisher(MSQWarningReportPublisher delegate)
-  {
-    this(
-        delegate,
-        Limits.MAX_VERBOSE_WARNINGS,
-        ImmutableMap.of(
-            CannotParseExternalDataFault.CODE, Limits.MAX_VERBOSE_PARSE_EXCEPTIONS
-        )
-    );
-  }
-
+  /**
+   * Creates a publisher which publishes the warnings to the controller if they have not yet exceeded the allowed limit.
+   * Moreover, if a warning is disallowed, i.e. it's limit is set to 0, then the publisher directly reports the warning

Review Comment:
   Nit: It would be much better if we could explain each variable. We really do not need to mention the tooManyWarningsFault piece as it's just confusing.



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/error/MSQWarningReportLimiterPublisher.java:
##########
@@ -74,12 +78,22 @@ public void publishException(int stageNumber, Throwable e)
       totalCount = totalCount + 1;
       errorCodeToCurrentCount.compute(errorCode, (ignored, count) -> count == null ? 1L : count + 1);
 
+      // Send the warning as an error if it is disallowed altogether
+      if (disallowedWarningCode.contains(errorCode)) {
+        try {
+          controllerClient.postWorkerError(workerId, MSQErrorReport.fromException(workerId, host, stageNumber, e));
+        }
+        catch (IOException e2) {

Review Comment:
   Nit: rename from e2 to postException?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on a diff in pull request #13198: MSQ: Report the warning directly as an error if none of it is allowed by the user

Posted by GitBox <gi...@apache.org>.
cryptoe commented on code in PR #13198:
URL: https://github.com/apache/druid/pull/13198#discussion_r995344334


##########
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/test/MSQTestBase.java:
##########
@@ -851,19 +858,28 @@ public void verifyResults()
       Preconditions.checkArgument(expectedDataSource != null, "dataSource cannot be null");
       Preconditions.checkArgument(expectedRowSignature != null, "expectedRowSignature cannot be null");
       Preconditions.checkArgument(
-          expectedResultRows != null || expectedMSQFault != null,
-          "atleast one of expectedResultRows or expectedMSQFault should be set to non null"
+          expectedResultRows != null || expectedMSQFault != null || expectedMSQFaultClass != null,
+          "atleast one of expectedResultRows, expectedMSQFault or expectedMSQFaultClass should be set to non null"
       );
       Preconditions.checkArgument(expectedShardSpec != null, "shardSpecClass cannot be null");
       readyToRun();
       try {
         String controllerId = runMultiStageQuery(sql, queryContext);
-        if (expectedMSQFault != null) {
+        if (expectedMSQFault != null || expectedMSQFaultClass != null) {
           MSQErrorReport msqErrorReport = getErrorReportOrThrow(controllerId);
-          Assert.assertEquals(
-              expectedMSQFault.getCodeWithMessage(),
-              msqErrorReport.getFault().getCodeWithMessage()
-          );
+          if (expectedMSQFault != null) {
+            Assert.assertEquals(
+                expectedMSQFault.getCodeWithMessage(),
+                msqErrorReport.getFault().getCodeWithMessage()
+            );
+          }
+          if (expectedMSQFaultClass != null) {

Review Comment:
   Yup We can do this outside this PR. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org