Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/07/14 17:41:10 UTC

[GitHub] [pinot] saurabhd336 opened a new pull request, #9058: Task genrator debug api

saurabhd336 opened a new pull request, #9058:
URL: https://github.com/apache/pinot/pull/9058

   Save task generator info into an in-memory map, and provide a controller API to fetch the details for a given table
   design doc: https://docs.google.com/document/d/1bCn13k57WLSaIAQGfATP75VS3p6V_ievlBi_HoOtmZ8/edit?usp=sharing
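
A minimal sketch of the idea in the description, assuming the cache is a concurrent map keyed by table name and task type. The class names `RunInfoSketch` and `InMemoryStatusCacheSketch` and the key format are hypothetical; the method names `saveTaskGeneratorInfo` and `fetchTaskGeneratorInfo` are taken from the quoted diff, but their exact signatures in the PR may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical stand-in for the per-table, per-task-type run history.
final class RunInfoSketch {
  private final List<Long> successTimestamps = new ArrayList<>();

  synchronized void addSuccessRunTs(long ts) {
    successTimestamps.add(ts);
  }

  // Returns a copy so callers cannot mutate the cached state.
  synchronized List<Long> getSuccessTimestamps() {
    return new ArrayList<>(successTimestamps);
  }
}

// Hypothetical in-memory cache keyed by "<tableNameWithType>/<taskType>" (assumed key format).
final class InMemoryStatusCacheSketch {
  private final Map<String, RunInfoSketch> cache = new ConcurrentHashMap<>();

  private static String key(String tableNameWithType, String taskType) {
    return tableNameWithType + "/" + taskType;
  }

  // Creates the entry on first use, then applies the caller's update to it.
  void saveTaskGeneratorInfo(String tableNameWithType, String taskType, Consumer<RunInfoSketch> updater) {
    RunInfoSketch info = cache.computeIfAbsent(key(tableNameWithType, taskType), k -> new RunInfoSketch());
    updater.accept(info);
  }

  // Returns null when nothing has been recorded for the table and task type.
  RunInfoSketch fetchTaskGeneratorInfo(String tableNameWithType, String taskType) {
    return cache.get(key(tableNameWithType, taskType));
  }
}
```

Because the state lives only in controller memory, it would be lost on restart; that trade-off is inherent to an in-memory map and is not something the sketch tries to solve.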


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r931779946


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;
+  // the timestamp of the most recent several success runs
+  @Nonnull
+  private final List<Long> _mostRecentSuccessRunTS;
+  private final int _version;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType,
+      Map<Long, String> mostRecentErrorRunMessage, List<Long> mostRecentSuccessRunTS, int version) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    // sort and keep the most recent several error messages
+    _mostRecentErrorRunMessage = new TreeMap<>();
+    mostRecentErrorRunMessage.forEach((k, v) -> {
+      _mostRecentErrorRunMessage.put(k, v);
+      if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+      }
+    });
+    // sort and keep the most recent several timestamp
+    Queue<Long> sortedTs = new PriorityQueue<>();
+    mostRecentSuccessRunTS.forEach(e -> {
+      sortedTs.offer(e);
+      if (sortedTs.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        sortedTs.poll();
+      }
+    });
+    _mostRecentSuccessRunTS = new ArrayList<>();
+    while (!sortedTs.isEmpty()) {
+      _mostRecentSuccessRunTS.add(sortedTs.poll());
+    }
+    _version = version;
+  }
+
+  /**
+   * Returns the table name with type
+   */
+  public String getTableNameWithType() {
+    return _tableNameWithType;
+  }
+
+  @Override
+  public String getTaskType() {
+    return _taskType;
+  }
+
+  /**
+   * Gets the timestamp to error message map of the most recent several error runs
+   */
+  public TreeMap<Long, String> getMostRecentErrorRunMessage() {
+    return _mostRecentErrorRunMessage;
+  }
+
+  /**
+   * Adds an error run message
+   * @param ts A timestamp
+   * @param message An error message.
+   */
+  public void addErrorRunMessage(long ts, String message) {
+    _mostRecentErrorRunMessage.put(ts, message);
+    if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+      _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+    }
+  }
+
+  /**
+   * Gets the timestamp of the most recent several success runs
+   */
+  public List<Long> getMostRecentSuccessRunTS() {
+    return _mostRecentSuccessRunTS;
+  }
+
+  /**
+   * Adds a success task generating run timestamp
+   * @param ts A timestamp
+   */
+  public void addSuccessRunTs(long ts) {
+    _mostRecentSuccessRunTS.add(ts);

Review Comment:
   This only indicates the successful generation of the task. This API is exclusively for task generation.





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926262334


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;
+  // the timestamp of the most recent several success runs
+  @Nonnull
+  private final List<Long> _mostRecentSuccessRunTS;
+  private final int _version;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType,
+      Map<Long, String> mostRecentErrorRunMessage, List<Long> mostRecentSuccessRunTS, int version) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    // sort and keep the most recent several error messages
+    _mostRecentErrorRunMessage = new TreeMap<>();

Review Comment:
   Ack





[GitHub] [pinot] sajjad-moradi commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
sajjad-moradi commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r922894155


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +236,23 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public BaseTaskGeneratorInfo getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType) {
+    BaseTaskGeneratorInfo
+        taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);

Review Comment:
   There are two parts:
   - generating task configurations, which happens in Pinot code
   - submitting the task via TaskDriver, which is in Helix code
   
   Subbu has a valid point that Helix should provide some information about failures in the second part, task submission. The first part, though, happens before reaching Helix, so it is good to have an API that captures issues in task generation in Pinot code.
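
The split between the two parts can be sketched as follows. This is an illustration only: `TwoPhaseScheduleSketch`, `Phase`, and `Outcome` are hypothetical names, task configs are modelled as plain strings, and the generator and submitter are passed in as functions rather than being the real `PinotTaskGenerator` or Helix `TaskDriver`.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

final class TwoPhaseScheduleSketch {
  // Which phase a failure came from; NONE means both phases succeeded.
  enum Phase { GENERATION, SUBMISSION, NONE }

  static final class Outcome {
    final Phase failedPhase;
    final String message;

    Outcome(Phase failedPhase, String message) {
      this.failedPhase = failedPhase;
      this.message = message;
    }
  }

  // Runs generation and submission in separate try blocks so a failure
  // can be attributed to the phase that raised it.
  static Outcome schedule(Supplier<List<String>> generator, Function<List<String>, String> submitter) {
    List<String> taskConfigs;
    try {
      taskConfigs = generator.get();
    } catch (RuntimeException e) {
      // Failure before anything reaches the submission layer.
      return new Outcome(Phase.GENERATION, String.valueOf(e.getMessage()));
    }
    try {
      return new Outcome(Phase.NONE, submitter.apply(taskConfigs));
    } catch (RuntimeException e) {
      return new Outcome(Phase.SUBMISSION, String.valueOf(e.getMessage()));
    }
  }
}
```

Only the `GENERATION` branch corresponds to what the debug API in this PR is meant to surface; the `SUBMISSION` branch is shown for contrast.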
   





[GitHub] [pinot] klsince commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
klsince commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r922557908


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +236,23 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public BaseTaskGeneratorInfo getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType) {
+    BaseTaskGeneratorInfo
+        taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);

Review Comment:
   For 2, I feel it's not very convenient when minion workers are auto-scaled frequently based on pending tasks. The debug API would either be unreachable when no workers are around, or have no past task states to check once new workers are added.





[GitHub] [pinot] npawar commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
npawar commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r921827074


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;
+  // the timestamp of the most recent several success runs
+  @Nonnull
+  private final List<Long> _mostRecentSuccessRunTS;
+  private final int _version;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType,
+      Map<Long, String> mostRecentErrorRunMessage, List<Long> mostRecentSuccessRunTS, int version) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    // sort and keep the most recent several error messages
+    _mostRecentErrorRunMessage = new TreeMap<>();

Review Comment:
   It seems this constructor is only ever called from within this same class, with an empty list and map. Do we need all this sorting logic?



##########
pinot-common/src/main/java/org/apache/pinot/common/utils/helix/FakePropertyStore.java:
##########
@@ -37,7 +37,13 @@ public FakePropertyStore() {
 
   @Override
   public ZNRecord get(String path, Stat stat, int options) {
-    return _contents.get(path);
+    ZNRecord znRecord = _contents.get(path);

Review Comment:
   is this change needed?



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java:
##########
@@ -517,7 +526,34 @@ private synchronized Map<String, String> scheduleTasks(List<String> tableNamesWi
   private String scheduleTask(PinotTaskGenerator taskGenerator, List<TableConfig> enabledTableConfigs,
       boolean isLeader) {
     LOGGER.info("Trying to schedule task type: {}, isLeader: {}", taskGenerator.getTaskType(), isLeader);
-    List<PinotTaskConfig> pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+    List<PinotTaskConfig> pinotTaskConfigs;
+    try {
+      pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+      for (TableConfig tableConfig : enabledTableConfigs) {
+        try {
+          _taskManagerStatusCache.saveTaskGeneratorInfo(tableConfig.getTableName(), taskGenerator.getTaskType(),
+              taskGeneratorMostRecentRunInfo -> taskGeneratorMostRecentRunInfo.addSuccessRunTs(
+                  System.currentTimeMillis()));
+        } catch (Exception exception) {
+          LOGGER.warn("Failed to save task generator success timestamp to ZK", exception);
+        }
+      }
+    } catch (Exception e) {
+      for (TableConfig tableConfig : enabledTableConfigs) {
+        try {
+          StringWriter errors = new StringWriter();

Review Comment:
   it seems we will do the same thing for each table config, and then put the same stacktrace for every table?
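
One way to avoid the repeated work pointed out here is to render the stack trace once, before the loop, and reuse the string. A minimal sketch, assuming table names are plain strings and using a map in place of the status cache; `StackTraceOnceSketch`, `stackTraceOf`, and `recordFailure` are hypothetical names.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class StackTraceOnceSketch {
  // Renders a throwable's stack trace into a string.
  static String stackTraceOf(Throwable t) {
    StringWriter errors = new StringWriter();
    try (PrintWriter writer = new PrintWriter(errors)) {
      t.printStackTrace(writer);
    }
    return errors.toString();
  }

  // Builds the trace a single time and stores the same string for every table.
  static Map<String, String> recordFailure(List<String> tableNames, Throwable t) {
    String trace = stackTraceOf(t);
    Map<String, String> perTable = new LinkedHashMap<>();
    for (String tableName : tableNames) {
      perTable.put(tableName, trace);
    }
    return perTable;
  }
}
```

This removes the per-table rendering cost but still records the same trace for every table; attributing a failure to a single table would need the generator to report errors per table, which this sketch does not attempt.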



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;
+  // the timestamp of the most recent several success runs
+  @Nonnull
+  private final List<Long> _mostRecentSuccessRunTS;
+  private final int _version;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType,
+      Map<Long, String> mostRecentErrorRunMessage, List<Long> mostRecentSuccessRunTS, int version) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    // sort and keep the most recent several error messages
+    _mostRecentErrorRunMessage = new TreeMap<>();
+    mostRecentErrorRunMessage.forEach((k, v) -> {
+      _mostRecentErrorRunMessage.put(k, v);
+      if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+      }
+    });
+    // sort and keep the most recent several timestamp
+    Queue<Long> sortedTs = new PriorityQueue<>();
+    mostRecentSuccessRunTS.forEach(e -> {
+      sortedTs.offer(e);
+      if (sortedTs.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        sortedTs.poll();
+      }
+    });
+    _mostRecentSuccessRunTS = new ArrayList<>();
+    while (!sortedTs.isEmpty()) {
+      _mostRecentSuccessRunTS.add(sortedTs.poll());
+    }
+    _version = version;
+  }
+
+  /**
+   * Returns the table name with type
+   */
+  public String getTableNameWithType() {
+    return _tableNameWithType;
+  }
+
+  @Override
+  public String getTaskType() {
+    return _taskType;
+  }
+
+  /**
+   * Gets the timestamp to error message map of the most recent several error runs
+   */
+  public TreeMap<Long, String> getMostRecentErrorRunMessage() {
+    return _mostRecentErrorRunMessage;
+  }
+
+  /**
+   * Adds an error run message
+   * @param ts A timestamp
+   * @param message An error message.
+   */
+  public void addErrorRunMessage(long ts, String message) {
+    _mostRecentErrorRunMessage.put(ts, message);
+    if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+      _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+    }
+  }
+
+  /**
+   * Gets the timestamp of the most recent several success runs
+   */
+  public List<Long> getMostRecentSuccessRunTS() {
+    return _mostRecentSuccessRunTS;
+  }
+
+  /**
+   * Adds a success task generating run timestamp
+   * @param ts A timestamp
+   */
+  public void addSuccessRunTs(long ts) {
+    _mostRecentSuccessRunTS.add(ts);
+    if (_mostRecentSuccessRunTS.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+      // sort first in case the given timestamp is not the largest one.
+      Collections.sort(_mostRecentSuccessRunTS);
+      _mostRecentSuccessRunTS.remove(0);
+    }
+  }
+
+  /**
+   * Returns the current information version, it should be consistent with the corresponding {@link ZNRecord} version.
+   */
+  public int getVersion() {

Review Comment:
   Probably don't need this method either.





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926387988


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java:
##########
@@ -517,7 +526,34 @@ private synchronized Map<String, String> scheduleTasks(List<String> tableNamesWi
   private String scheduleTask(PinotTaskGenerator taskGenerator, List<TableConfig> enabledTableConfigs,
       boolean isLeader) {
     LOGGER.info("Trying to schedule task type: {}, isLeader: {}", taskGenerator.getTaskType(), isLeader);
-    List<PinotTaskConfig> pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+    List<PinotTaskConfig> pinotTaskConfigs;
+    try {
+      pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+      for (TableConfig tableConfig : enabledTableConfigs) {

Review Comment:
   Had a discussion offline. Added the todo





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926263008


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;
+  // the timestamp of the most recent several success runs
+  @Nonnull
+  private final List<Long> _mostRecentSuccessRunTS;
+  private final int _version;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType,
+      Map<Long, String> mostRecentErrorRunMessage, List<Long> mostRecentSuccessRunTS, int version) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    // sort and keep the most recent several error messages
+    _mostRecentErrorRunMessage = new TreeMap<>();
+    mostRecentErrorRunMessage.forEach((k, v) -> {
+      _mostRecentErrorRunMessage.put(k, v);
+      if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+      }
+    });
+    // sort and keep the most recent several timestamp
+    Queue<Long> sortedTs = new PriorityQueue<>();
+    mostRecentSuccessRunTS.forEach(e -> {
+      sortedTs.offer(e);
+      if (sortedTs.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        sortedTs.poll();
+      }
+    });
+    _mostRecentSuccessRunTS = new ArrayList<>();
+    while (!sortedTs.isEmpty()) {
+      _mostRecentSuccessRunTS.add(sortedTs.poll());
+    }
+    _version = version;
+  }
+
+  /**
+   * Returns the table name with type
+   */
+  public String getTableNameWithType() {
+    return _tableNameWithType;
+  }
+
+  @Override
+  public String getTaskType() {
+    return _taskType;
+  }
+
+  /**
+   * Gets the timestamp to error message map of the most recent several error runs
+   */
+  public TreeMap<Long, String> getMostRecentErrorRunMessage() {
+    return _mostRecentErrorRunMessage;
+  }
+
+  /**
+   * Adds an error run message
+   * @param ts A timestamp
+   * @param message An error message.
+   */
+  public void addErrorRunMessage(long ts, String message) {
+    _mostRecentErrorRunMessage.put(ts, message);
+    if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+      _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+    }
+  }
+
+  /**
+   * Gets the timestamp of the most recent several success runs
+   */
+  public List<Long> getMostRecentSuccessRunTS() {
+    return _mostRecentSuccessRunTS;
+  }
+
+  /**
+   * Adds a success task generating run timestamp
+   * @param ts A timestamp
+   */
+  public void addSuccessRunTs(long ts) {
+    _mostRecentSuccessRunTS.add(ts);

Review Comment:
   It simplifies the eviction, is all. Also, the API response is easier to work with (the most recent timestamp is the one at the top).
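
The eviction being discussed, as it appears in the full `addSuccessRunTs` hunk quoted elsewhere in this thread, can be sketched in isolation like this. `BoundedTimestampHistorySketch` and the `newestFirst` helper are hypothetical additions for illustration; only the add, sort, and remove-oldest steps mirror the quoted diff.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BoundedTimestampHistorySketch {
  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;

  private final List<Long> timestamps = new ArrayList<>();

  // Appends the timestamp; once over capacity, sorts ascending and drops the oldest.
  void addSuccessRunTs(long ts) {
    timestamps.add(ts);
    if (timestamps.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
      // Sort first in case the given timestamp is not the largest one.
      Collections.sort(timestamps);
      timestamps.remove(0);
    }
  }

  // Hypothetical helper: returns a copy ordered with the most recent timestamp first.
  List<Long> newestFirst() {
    List<Long> copy = new ArrayList<>(timestamps);
    copy.sort(Collections.reverseOrder());
    return copy;
  }
}
```

Sorting on every overflow is cheap here because the list never holds more than six entries at the moment of the sort.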





[GitHub] [pinot] saurabhd336 commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1184728743

   @npawar @mcvsubbu I've raised this PR in place of the now closed PR https://github.com/apache/pinot/pull/9043




[GitHub] [pinot] codecov-commenter commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1185342402

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/9058?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#9058](https://codecov.io/gh/apache/pinot/pull/9058?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fd0e6bd) into [master](https://codecov.io/gh/apache/pinot/commit/84478b645188174a9e56cf565f813dbf80a8acf3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (84478b6) will **decrease** coverage by `54.71%`.
   > The diff coverage is `7.00%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #9058       +/-   ##
   =============================================
   - Coverage     70.13%   15.41%   -54.72%     
   + Complexity     4741      170     -4571     
   =============================================
     Files          1831     1787       -44     
     Lines         96382    94373     -2009     
     Branches      14408    14178      -230     
   =============================================
   - Hits          67594    14548    -53046     
   - Misses        24125    78797    +54672     
   + Partials       4663     1028     -3635     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `?` | |
   | unittests1 | `?` | |
   | unittests2 | `15.41% <7.00%> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/9058?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...che/pinot/common/minion/BaseTaskGeneratorInfo.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWluaW9uL0Jhc2VUYXNrR2VuZXJhdG9ySW5mby5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [.../common/minion/InMemoryTaskManagerStatusCache.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWluaW9uL0luTWVtb3J5VGFza01hbmFnZXJTdGF0dXNDYWNoZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [.../common/minion/TaskGeneratorMostRecentRunInfo.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWluaW9uL1Rhc2tHZW5lcmF0b3JNb3N0UmVjZW50UnVuSW5mby5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...he/pinot/common/minion/TaskManagerStatusCache.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWluaW9uL1Rhc2tNYW5hZ2VyU3RhdHVzQ2FjaGUuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...roller/api/resources/PinotTaskRestletResource.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9hcGkvcmVzb3VyY2VzL1Bpbm90VGFza1Jlc3RsZXRSZXNvdXJjZS5qYXZh) | `0.00% <0.00%> (-3.98%)` | :arrow_down: |
   | [...controller/helix/core/minion/PinotTaskManager.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL21pbmlvbi9QaW5vdFRhc2tNYW5hZ2VyLmphdmE=) | `36.44% <13.79%> (-30.86%)` | :arrow_down: |
   | [...apache/pinot/controller/BaseControllerStarter.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9CYXNlQ29udHJvbGxlclN0YXJ0ZXIuamF2YQ==) | `78.88% <100.00%> (-3.88%)` | :arrow_down: |
   | [...src/main/java/org/apache/pinot/sql/FilterKind.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcWwvRmlsdGVyS2luZC5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ain/java/org/apache/pinot/core/data/table/Key.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL0tleS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/pinot/spi/utils/BooleanUtils.java](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQm9vbGVhblV0aWxzLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [1394 more](https://codecov.io/gh/apache/pinot/pull/9058/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/9058?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/9058?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [84478b6...fd0e6bd](https://codecov.io/gh/apache/pinot/pull/9058?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [pinot] saurabhd336 commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1189968285

   @mcvsubbu @sajjad-moradi @npawar @klsince I've tried to address the issue of the debug data being spread across different controllers by letting the main controller query every other controller for the status data.
   Please have a look at the updated PR.
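
A rough sketch of what such a fan-out could look like, assuming each controller returns a timestamp-to-error-message map. `ControllerFanOut`, `mergeRecentErrors`, and the `fetcher` function are hypothetical and stand in for whatever HTTP client and response type the PR actually uses.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Hypothetical illustration (not from the PR) of merging status data from several controllers. */
final class ControllerFanOut {
  private ControllerFanOut() {
  }

  /**
   * Asks every controller for its timestamp-to-message map through the supplied fetcher,
   * merges the results, and keeps only the maxEntries most recent entries, newest first.
   */
  static Map<Long, String> mergeRecentErrors(List<String> controllers,
      Function<String, Map<Long, String>> fetcher, int maxEntries) {
    TreeMap<Long, String> merged = new TreeMap<Long, String>(Comparator.reverseOrder());
    for (String controller : controllers) {
      try {
        merged.putAll(fetcher.apply(controller));
      } catch (RuntimeException e) {
        // Assumption: an unreachable controller is skipped rather than failing the whole request.
      }
    }
    while (merged.size() > maxEntries) {
      merged.pollLastEntry(); // drop the oldest entry
    }
    return merged;
  }
}
```

Whether a failing controller should be skipped or reported in the response is a design choice this sketch does not settle.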




[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926267115


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java:
##########
@@ -517,7 +526,34 @@ private synchronized Map<String, String> scheduleTasks(List<String> tableNamesWi
   private String scheduleTask(PinotTaskGenerator taskGenerator, List<TableConfig> enabledTableConfigs,
       boolean isLeader) {
     LOGGER.info("Trying to schedule task type: {}, isLeader: {}", taskGenerator.getTaskType(), isLeader);
-    List<PinotTaskConfig> pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+    List<PinotTaskConfig> pinotTaskConfigs;
+    try {
+      pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+      for (TableConfig tableConfig : enabledTableConfigs) {
+        try {
+          _taskManagerStatusCache.saveTaskGeneratorInfo(tableConfig.getTableName(), taskGenerator.getTaskType(),

Review Comment:
   With the in-memory storage, I don't see any exceptions that can be thrown. I have removed the try-catch block.





[GitHub] [pinot] mcvsubbu commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r922560474


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +236,23 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public BaseTaskGeneratorInfo getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType) {
+    BaseTaskGeneratorInfo
+        taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);

Review Comment:
   But we have Helix APIs for that, right? Has that avenue been fully explored? We are migrating to Helix 1.x very soon (we are getting to production with that), so please check the 1.x pages for what is available for task management and debugging, thanks. We will also do the same.
   cc: @sajjad-moradi 





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926261345


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull

Review Comment:
   Ack





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926261222


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";

Review Comment:
   This was used when we had ZK-based storage. Removed.





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r924223574


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/InMemoryTaskManagerStatusCache.java:
##########
@@ -0,0 +1,82 @@
+package org.apache.pinot.common.minion;
+
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Consumer;
+
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one

Review Comment:
   Ack





[GitHub] [pinot] klsince commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
klsince commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r922317757


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/InMemoryTaskManagerStatusCache.java:
##########
@@ -0,0 +1,82 @@
+package org.apache.pinot.common.minion;
+
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Consumer;
+
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one

Review Comment:
   format: move this header to top





[GitHub] [pinot] mcvsubbu commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1185728870

   > > Overall, I am a plus 1 on more minion debuggability.
   > > My view is that we should add APIs to minions similar to that of servers, and pull debugging info from there and display it.
   > 
   > We already have minion APIs (look for the /debug endpoint in the Tasks tab). This is for scheduler errors, and the scope of the scheduler remains limited to the controller, hence I thought it makes most sense as an API here
   
   I don't see a minion debug REST endpoint in the code. Am I missing something?




[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r927327386


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskManagerStatusCache.java:
##########
@@ -0,0 +1,32 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import java.util.function.Consumer;
+
+
+public abstract class TaskManagerStatusCache<T extends BaseTaskGeneratorInfo> {

Review Comment:
   Ack





[GitHub] [pinot] sajjad-moradi commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
sajjad-moradi commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r922894461


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +236,23 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public BaseTaskGeneratorInfo getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType) {
+    BaseTaskGeneratorInfo
+        taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);

Review Comment:
   Also, tasks can be scheduled by either:
   - the periodic task generator job on the lead controller, or
   - the task REST endpoint, which can be on any controller
   
   So this means the approach in this PR doesn't work as expected.





[GitHub] [pinot] mcvsubbu commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1184848283

   Overall, I am a plus 1 on more minion debuggability. 
   
   My view is that we should add APIs to minions similar to that of servers, and pull debugging info from there and display it.




[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r921924190


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;
+  // the timestamp of the most recent several success runs
+  @Nonnull
+  private final List<Long> _mostRecentSuccessRunTS;
+  private final int _version;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType,
+      Map<Long, String> mostRecentErrorRunMessage, List<Long> mostRecentSuccessRunTS, int version) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    // sort and keep the most recent several error messages
+    _mostRecentErrorRunMessage = new TreeMap<>();

Review Comment:
   If we ever move to persistent storage, we might need to serialize/deserialize instances of this class using an all-args constructor.





[GitHub] [pinot] klsince commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
klsince commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r925891929


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java:
##########
@@ -517,7 +526,34 @@ private synchronized Map<String, String> scheduleTasks(List<String> tableNamesWi
   private String scheduleTask(PinotTaskGenerator taskGenerator, List<TableConfig> enabledTableConfigs,
       boolean isLeader) {
     LOGGER.info("Trying to schedule task type: {}, isLeader: {}", taskGenerator.getTaskType(), isLeader);
-    List<PinotTaskConfig> pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+    List<PinotTaskConfig> pinotTaskConfigs;
+    try {
+      pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+      for (TableConfig tableConfig : enabledTableConfigs) {
+        try {
+          _taskManagerStatusCache.saveTaskGeneratorInfo(tableConfig.getTableName(), taskGenerator.getTaskType(),

Review Comment:
   nit: add a saveTaskGeneratorInfoQuietly() method to avoid the try-catch block here, for brevity.
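
One possible shape for such a helper, as a sketch only. The `StatusCache` interface below is a hypothetical stand-in for the PR's status cache (its real signature may differ), and `QuietSaver` is an illustrative name; only the idea of swallowing and logging the exception comes from the comment above.

```java
import java.util.function.Consumer;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the status cache; the real class in the PR may differ. */
interface StatusCache<T> {
  void saveTaskGeneratorInfo(String tableNameWithType, String taskType, Consumer<T> updater);
}

/** Hypothetical wrapper that keeps a failing save from interrupting task scheduling. */
final class QuietSaver {
  private static final Logger LOGGER = Logger.getLogger(QuietSaver.class.getName());

  private QuietSaver() {
  }

  /** Returns true if the save succeeded; logs and returns false if the cache threw. */
  static <T> boolean saveTaskGeneratorInfoQuietly(StatusCache<T> cache, String tableNameWithType,
      String taskType, Consumer<T> updater) {
    try {
      cache.saveTaskGeneratorInfo(tableNameWithType, taskType, updater);
      return true;
    } catch (Exception e) {
      LOGGER.log(Level.WARNING,
          "Failed to save task generator info for table: " + tableNameWithType + ", task type: " + taskType, e);
      return false;
    }
  }
}
```

The call site then shrinks to a single line per table, and a failing debug write cannot abort scheduling.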



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;
+  // the timestamp of the most recent several success runs
+  @Nonnull
+  private final List<Long> _mostRecentSuccessRunTS;
+  private final int _version;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType,
+      Map<Long, String> mostRecentErrorRunMessage, List<Long> mostRecentSuccessRunTS, int version) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    // sort and keep the most recent several error messages
+    _mostRecentErrorRunMessage = new TreeMap<>();
+    mostRecentErrorRunMessage.forEach((k, v) -> {
+      _mostRecentErrorRunMessage.put(k, v);
+      if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+      }
+    });
+    // sort and keep the most recent several timestamp
+    Queue<Long> sortedTs = new PriorityQueue<>();
+    mostRecentSuccessRunTS.forEach(e -> {
+      sortedTs.offer(e);
+      if (sortedTs.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        sortedTs.poll();
+      }
+    });
+    _mostRecentSuccessRunTS = new ArrayList<>();
+    while (!sortedTs.isEmpty()) {
+      _mostRecentSuccessRunTS.add(sortedTs.poll());
+    }
+    _version = version;
+  }
+
+  /**
+   * Returns the table name with type
+   */
+  public String getTableNameWithType() {
+    return _tableNameWithType;
+  }
+
+  @Override
+  public String getTaskType() {
+    return _taskType;
+  }
+
+  /**
+   * Gets the timestamp to error message map of the most recent several error runs
+   */
+  public TreeMap<Long, String> getMostRecentErrorRunMessage() {
+    return _mostRecentErrorRunMessage;
+  }
+
+  /**
+   * Adds an error run message
+   * @param ts A timestamp
+   * @param message An error message.
+   */
+  public void addErrorRunMessage(long ts, String message) {
+    _mostRecentErrorRunMessage.put(ts, message);
+    if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+      _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+    }
+  }
+
+  /**
+   * Gets the timestamp of the most recent several success runs
+   */
+  public List<Long> getMostRecentSuccessRunTS() {
+    return _mostRecentSuccessRunTS;
+  }
+
+  /**
+   * Adds a success task generating run timestamp
+   * @param ts A timestamp
+   */
+  public void addSuccessRunTs(long ts) {
+    _mostRecentSuccessRunTS.add(ts);

Review Comment:
   Q: any need to keep _mostRecentSuccessRunTS sorted?
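
If the order does matter, one possible way to keep the list sorted and bounded on every add is sketched below. `SuccessHistorySketch` is a hypothetical class, and the cap of 5 simply mirrors `MAX_NUM_OF_HISTORY_TO_KEEP` from the diff:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical bounded history of success timestamps, kept in ascending order.
class SuccessHistorySketch {
  static final int MAX = 5; // mirrors MAX_NUM_OF_HISTORY_TO_KEEP in the diff

  private final List<Long> ts = new ArrayList<>();

  void add(long t) {
    // binarySearch returns (-(insertion point) - 1) when the key is absent.
    int idx = Collections.binarySearch(ts, t);
    if (idx < 0) {
      idx = -idx - 1;
    }
    ts.add(idx, t);
    // Drop the oldest entry once the cap is exceeded.
    if (ts.size() > MAX) {
      ts.remove(0);
    }
  }

  List<Long> get() {
    return Collections.unmodifiableList(ts);
  }
}
```

If timestamps only ever arrive in increasing order, a plain append plus trimming the head would be enough and the binary search could be dropped.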





[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926088454


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";

Review Comment:
   Seems not used?



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/InMemoryTaskManagerStatusCache.java:
##########
@@ -0,0 +1,82 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.pinot.common.minion;
+
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Consumer;
+
+public class InMemoryTaskManagerStatusCache extends TaskManagerStatusCache<TaskGeneratorMostRecentRunInfo> {
+
+  private static class TaskGeneratorCacheKey {
+    String _tableNameWithType;

Review Comment:
   (minor) make them `final`



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;
+  // the timestamp of the most recent several success runs
+  @Nonnull
+  private final List<Long> _mostRecentSuccessRunTS;
+  private final int _version;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType,
+      Map<Long, String> mostRecentErrorRunMessage, List<Long> mostRecentSuccessRunTS, int version) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    // sort and keep the most recent several error messages
+    _mostRecentErrorRunMessage = new TreeMap<>();

Review Comment:
   Since we decided to not use ZNode to store the information, suggest simplifying it to only take `tableNameWithType` and `taskType`. We may change it when we decide to persist the info in the future.



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.

Review Comment:
   (minor)
   ```suggestion
    * several error run messages.
   ```



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;

Review Comment:
   Consider storing the timestamp as `java.sql.Timestamp` so that the response can show human-readable time
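
A small sketch of the difference in rendering. `ReadableTsSketch` is a hypothetical helper; the `java.time.Instant` variant is an alternative that was not proposed in the review and is shown only for comparison:

```java
import java.sql.Timestamp;
import java.time.Instant;

// Hypothetical helper showing two ways to render epoch millis for humans.
class ReadableTsSketch {
  // java.sql.Timestamp.toString() renders in the JVM's default time zone.
  static String viaSqlTimestamp(long epochMillis) {
    return new Timestamp(epochMillis).toString();
  }

  // Alternative (not from the review): ISO-8601 in UTC, independent of the JVM time zone.
  static String viaInstant(long epochMillis) {
    return Instant.ofEpochMilli(epochMillis).toString();
  }
}
```

Because `Timestamp.toString()` depends on the default time zone, responses from controllers running in different zones may render the same instant differently.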



##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +260,55 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Produces(MediaType.APPLICATION_JSON)
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public String getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType,
+      @ApiParam(value = "Whether to only lookup local cache for logs", defaultValue = "false") @QueryParam("localOnly")
+          boolean localOnly)
+      throws JsonProcessingException {
+    if (localOnly) {
+      BaseTaskGeneratorInfo taskGeneratorMostRecentRunInfo =
+          _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);
+      if (taskGeneratorMostRecentRunInfo == null) {
+        throw new ControllerApplicationException(LOGGER, "Task generation information not found",
+            Response.Status.NOT_FOUND);
+      }
+
+      return JsonUtils.objectToString(taskGeneratorMostRecentRunInfo);
+    }
+
+    // Call all controllers
+    List<InstanceConfig> controllers = _pinotHelixResourceManager.getAllControllerInstanceConfigs();
+    // Relying on original schema that was used to query the controller
+    URI uri = _uriInfo.getRequestUri();
+    String scheme = uri.getScheme();
+    List<String> controllerUrls = controllers.stream().map(controller -> {
+      return String.format("%s://%s:%d/tasks/generator/%s/%s/debug?localOnly=true", scheme, controller.getHostName(),
+          Integer.parseInt(controller.getPort()), tableNameWithType, taskType);
+    }).collect(Collectors.toList());
+
+    CompletionServiceHelper completionServiceHelper =
+        new CompletionServiceHelper(_executor, _connectionManager, HashBiMap.create(0));
+    CompletionServiceHelper.CompletionServiceResponse serviceResponse =
+        completionServiceHelper.doMultiGetRequest(controllerUrls, null, true, 10000);
+
+    List<JsonNode> result = new ArrayList<>();
+    serviceResponse._httpResponses.values().forEach(resp -> {
+      try {
+        result.add(JsonUtils.stringToJsonNode(resp));
+      } catch (IOException e) {
+        LOGGER.error("Failed to parse ");

Review Comment:
   Incomplete



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;

Review Comment:
   ```suggestion
     private final TreeMap<Long, String> _mostRecentErrorRunMessages;
   ```



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java:
##########
@@ -517,7 +526,34 @@ private synchronized Map<String, String> scheduleTasks(List<String> tableNamesWi
   private String scheduleTask(PinotTaskGenerator taskGenerator, List<TableConfig> enabledTableConfigs,
       boolean isLeader) {
     LOGGER.info("Trying to schedule task type: {}, isLeader: {}", taskGenerator.getTaskType(), isLeader);
-    List<PinotTaskConfig> pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+    List<PinotTaskConfig> pinotTaskConfigs;
+    try {
+      pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+      for (TableConfig tableConfig : enabledTableConfigs) {

Review Comment:
   Let's add a `TODO` here to track the generated tasks/exception separately for each table. Currently, even when a table has nothing scheduled (e.g. because enough tasks were already scheduled for previous tables), we still record a success timestamp for it.
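
One way the per-table separation could eventually look, as a rough sketch. `PerTableGenerationSketch`, `Outcome`, and the per-table generator function are all hypothetical; the real `PinotTaskGenerator.generateTasks` in the diff takes the whole list of table configs at once:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: run generation table by table and record each outcome separately.
class PerTableGenerationSketch {
  // Result for one table: the generated tasks, or an error message (null on success).
  static final class Outcome {
    final List<String> tasks;
    final String error;

    Outcome(List<String> tasks, String error) {
      this.tasks = tasks;
      this.error = error;
    }
  }

  static Map<String, Outcome> generatePerTable(List<String> tables,
      Function<String, List<String>> generator) {
    Map<String, Outcome> outcomes = new LinkedHashMap<>();
    for (String table : tables) {
      try {
        outcomes.put(table, new Outcome(generator.apply(table), null));
      } catch (Exception e) {
        // A failure for one table does not mark the other tables as failed.
        outcomes.put(table, new Outcome(new ArrayList<>(), String.valueOf(e.getMessage())));
      }
    }
    return outcomes;
  }
}
```

A caller could then record a success timestamp only for tables whose outcome has no error and at least one task.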



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull

Review Comment:
   (minor) We don't usually use `Nonnull`. Anything without an annotation is treated as non-null





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926260568


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/InMemoryTaskManagerStatusCache.java:
##########
@@ -0,0 +1,82 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.pinot.common.minion;
+
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Consumer;
+
+public class InMemoryTaskManagerStatusCache extends TaskManagerStatusCache<TaskGeneratorMostRecentRunInfo> {
+
+  private static class TaskGeneratorCacheKey {
+    String _tableNameWithType;

Review Comment:
   Ack





[GitHub] [pinot] mcvsubbu commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r931263764


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +260,55 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Produces(MediaType.APPLICATION_JSON)
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public String getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType,
+      @ApiParam(value = "Whether to only lookup local cache for logs", defaultValue = "false") @QueryParam("localOnly")
+          boolean localOnly)
+      throws JsonProcessingException {
+    if (localOnly) {
+      BaseTaskGeneratorInfo taskGeneratorMostRecentRunInfo =
+          _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);
+      if (taskGeneratorMostRecentRunInfo == null) {
+        throw new ControllerApplicationException(LOGGER, "Task generation information not found",
+            Response.Status.NOT_FOUND);
+      }
+
+      return JsonUtils.objectToString(taskGeneratorMostRecentRunInfo);
+    }
+
+    // Call all controllers
+    List<InstanceConfig> controllers = _pinotHelixResourceManager.getAllControllerInstanceConfigs();

Review Comment:
   Perhaps I was not clear. You can field the request on any controller (as you do now). If that controller is not the lead for the table, locate the lead controller (see the class `PinotLeadControllerRestletResource` for how to do that) and forward the request to just that controller.
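
The control flow being suggested can be sketched as follows. Everything here is hypothetical: `LeaderForwardingSketch` and the three function parameters stand in for the leader lookup, the local cache read, and the HTTP forward, none of which are taken from Pinot's actual API:

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical sketch of "serve locally if leader, otherwise forward to the leader only".
class LeaderForwardingSketch {
  static String fetchDebugInfo(String thisController, String tableNameWithType,
      Function<String, String> leaderOf,             // table -> lead controller id (assumed)
      Function<String, String> readLocalCache,       // table -> JSON from the local cache (assumed)
      BiFunction<String, String, String> forwardTo)  // (controller, table) -> JSON via HTTP (assumed)
  {
    String leader = leaderOf.apply(tableNameWithType);
    if (leader.equals(thisController)) {
      // This controller leads the table, so its in-memory cache has the data.
      return readLocalCache.apply(tableNameWithType);
    }
    // Otherwise make a single forwarded call instead of fanning out to all controllers.
    return forwardTo.apply(leader, tableNameWithType);
  }
}
```

Compared with the fan-out in the diff, this makes one remote call at most and avoids merging responses from controllers that hold no data for the table.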





[GitHub] [pinot] mcvsubbu commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r922451050


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +236,23 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public BaseTaskGeneratorInfo getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType) {
+    BaseTaskGeneratorInfo
+        taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);

Review Comment:
   What if the table is not mastered on this controller?





[GitHub] [pinot] sajjad-moradi commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
sajjad-moradi commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r922894155


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +236,23 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public BaseTaskGeneratorInfo getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType) {
+    BaseTaskGeneratorInfo
+        taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);

Review Comment:
   There are two parts:
   - generating the task configurations, which happens in Pinot code
   - submitting the task via TaskDriver, which happens in Helix code

   Subbu has a valid point that Helix should provide some info on failures in the second part (task submission). The first part, though, happens before reaching Helix, and it's good to have an API that captures issues in task generation in Pinot code.
   





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r921925802


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java:
##########
@@ -517,7 +526,34 @@ private synchronized Map<String, String> scheduleTasks(List<String> tableNamesWi
   private String scheduleTask(PinotTaskGenerator taskGenerator, List<TableConfig> enabledTableConfigs,
       boolean isLeader) {
     LOGGER.info("Trying to schedule task type: {}, isLeader: {}", taskGenerator.getTaskType(), isLeader);
-    List<PinotTaskConfig> pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+    List<PinotTaskConfig> pinotTaskConfigs;
+    try {
+      pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+      for (TableConfig tableConfig : enabledTableConfigs) {
+        try {
+          _taskManagerStatusCache.saveTaskGeneratorInfo(tableConfig.getTableName(), taskGenerator.getTaskType(),
+              taskGeneratorMostRecentRunInfo -> taskGeneratorMostRecentRunInfo.addSuccessRunTs(
+                  System.currentTimeMillis()));
+        } catch (Exception exception) {
+          LOGGER.warn("Failed to save task generator success timestamp to ZK", exception);
+        }
+      }
+    } catch (Exception e) {
+      for (TableConfig tableConfig : enabledTableConfigs) {
+        try {
+          StringWriter errors = new StringWriter();

Review Comment:
   Fixed.





[GitHub] [pinot] Jackie-Jiang merged pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged PR #9058:
URL: https://github.com/apache/pinot/pull/9058




[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r927295691


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,120 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.fasterxml.jackson.annotation.JsonPropertyOrder;
+import com.google.common.annotations.VisibleForTesting;
+import java.time.Instant;
+import java.time.OffsetDateTime;
+import java.time.ZoneId;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.TreeMap;
+import java.util.stream.Collectors;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run messages.
+ */
+@JsonPropertyOrder({"tableNameWithType", "taskType", "mostRecentSuccessRunTS", "mostRecentErrorRunMessages"})
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  private final TreeMap<Long, String> _mostRecentErrorRunMessages;
+  // the timestamp of the most recent several success runs
+  private final List<Long> _mostRecentSuccessRunTS;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    _mostRecentErrorRunMessages = new TreeMap<>();
+    _mostRecentSuccessRunTS = new ArrayList<>();
+  }
+
+  /**
+   * Returns the table name with type
+   */
+  public String getTableNameWithType() {
+    return _tableNameWithType;
+  }
+
+  @Override
+  public String getTaskType() {
+    return _taskType;
+  }
+
+  /**
+   * Gets the timestamp to error message map of the most recent several error runs
+   */
+  public TreeMap<String, String> getMostRecentErrorRunMessages() {
+    TreeMap<String, String> result = new TreeMap<>();
+    _mostRecentErrorRunMessages.forEach((timestamp, error) -> result.put(
+        OffsetDateTime.ofInstant(Instant.ofEpochMilli(timestamp), ZoneId.of("UTC")).toString(), error));

Review Comment:
   I feel keeping everything in UTC might be more user-friendly, especially when Pinot clusters are deployed in different geographies. Keeping it UTC for now.
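
As a small illustration of the UTC conversion in the getter above, the sketch below turns epoch-millisecond keys into UTC strings. `UtcTimestampSketch`, `toUtcString`, and `withUtcKeys` are hypothetical names used only for this example; `ZoneOffset.UTC` is used here in place of `ZoneId.of("UTC")` and should yield the same offset.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the PR.
class UtcTimestampSketch {
  /** Formats an epoch-millisecond timestamp as an ISO-8601 string in UTC. */
  static String toUtcString(long epochMillis) {
    return OffsetDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC).toString();
  }

  /** Re-keys a timestamp-to-message map with UTC strings. */
  static TreeMap<String, String> withUtcKeys(Map<Long, String> byEpochMillis) {
    TreeMap<String, String> result = new TreeMap<>();
    byEpochMillis.forEach((ts, message) -> result.put(toUtcString(ts), message));
    return result;
  }

  public static void main(String[] args) {
    System.out.println(toUtcString(0L));
    System.out.println(toUtcString(1000L));
  }
}
```

One caveat with this approach: `OffsetDateTime.toString()` omits the seconds field when it is zero, so string keys are not guaranteed to sort in chronological order. If ordering matters to a caller, sorting on the numeric timestamp before formatting is the safer option.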





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926269056


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java:
##########
@@ -517,7 +526,34 @@ private synchronized Map<String, String> scheduleTasks(List<String> tableNamesWi
   private String scheduleTask(PinotTaskGenerator taskGenerator, List<TableConfig> enabledTableConfigs,
       boolean isLeader) {
     LOGGER.info("Trying to schedule task type: {}, isLeader: {}", taskGenerator.getTaskType(), isLeader);
-    List<PinotTaskConfig> pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+    List<PinotTaskConfig> pinotTaskConfigs;
+    try {
+      pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs);
+      for (TableConfig tableConfig : enabledTableConfigs) {

Review Comment:
   I didn't get this. Isn't this method only called for `enabledTableConfigs`, i.e. only for tables for which the particular task is enabled?





[GitHub] [pinot] mcvsubbu commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r931564621


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.
+ */
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  @VisibleForTesting
+  static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs";
+  @VisibleForTesting
+  static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage";
+
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  @Nonnull
+  private final TreeMap<Long, String> _mostRecentErrorRunMessage;
+  // the timestamp of the most recent several success runs
+  @Nonnull
+  private final List<Long> _mostRecentSuccessRunTS;
+  private final int _version;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType,
+      Map<Long, String> mostRecentErrorRunMessage, List<Long> mostRecentSuccessRunTS, int version) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    // sort and keep the most recent several error messages
+    _mostRecentErrorRunMessage = new TreeMap<>();
+    mostRecentErrorRunMessage.forEach((k, v) -> {
+      _mostRecentErrorRunMessage.put(k, v);
+      if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+      }
+    });
+    // sort and keep the most recent several timestamp
+    Queue<Long> sortedTs = new PriorityQueue<>();
+    mostRecentSuccessRunTS.forEach(e -> {
+      sortedTs.offer(e);
+      if (sortedTs.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+        sortedTs.poll();
+      }
+    });
+    _mostRecentSuccessRunTS = new ArrayList<>();
+    while (!sortedTs.isEmpty()) {
+      _mostRecentSuccessRunTS.add(sortedTs.poll());
+    }
+    _version = version;
+  }
+
+  /**
+   * Returns the table name with type
+   */
+  public String getTableNameWithType() {
+    return _tableNameWithType;
+  }
+
+  @Override
+  public String getTaskType() {
+    return _taskType;
+  }
+
+  /**
+   * Gets the timestamp to error message map of the most recent several error runs
+   */
+  public TreeMap<Long, String> getMostRecentErrorRunMessage() {
+    return _mostRecentErrorRunMessage;
+  }
+
+  /**
+   * Adds an error run message
+   * @param ts A timestamp
+   * @param message An error message.
+   */
+  public void addErrorRunMessage(long ts, String message) {
+    _mostRecentErrorRunMessage.put(ts, message);
+    if (_mostRecentErrorRunMessage.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
+      _mostRecentErrorRunMessage.remove(_mostRecentErrorRunMessage.firstKey());
+    }
+  }
+
+  /**
+   * Gets the timestamp of the most recent several success runs
+   */
+  public List<Long> getMostRecentSuccessRunTS() {
+    return _mostRecentSuccessRunTS;
+  }
+
+  /**
+   * Adds a success task generating run timestamp
+   * @param ts A timestamp
+   */
+  public void addSuccessRunTs(long ts) {
+    _mostRecentSuccessRunTS.add(ts);

Review Comment:
   Does this indicate a successful run (completion) of the task, or a successful generation of the task? If the task fails in minion, does your API address it?
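
For readers following the question, here is a minimal sketch of a bounded history that records generation outcomes only, using the same trim-the-oldest approach the diff applies to error messages. `RecentRunHistorySketch` and its accessors are hypothetical. Using a `TreeSet` for success timestamps is an assumption of this sketch (duplicate timestamps collapse), since the diff is cut off before showing how the success list is trimmed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical simplification for illustration; not the class under review.
class RecentRunHistorySketch {
  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;

  // Timestamp -> error message for the most recent failed generation runs.
  private final TreeMap<Long, String> errorMessages = new TreeMap<>();
  // Timestamps of the most recent successful generation runs.
  private final TreeSet<Long> successTimestamps = new TreeSet<>();

  void addErrorRunMessage(long ts, String message) {
    errorMessages.put(ts, message);
    if (errorMessages.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
      errorMessages.pollFirstEntry();  // drop the oldest entry
    }
  }

  void addSuccessRunTs(long ts) {
    successTimestamps.add(ts);
    if (successTimestamps.size() > MAX_NUM_OF_HISTORY_TO_KEEP) {
      successTimestamps.pollFirst();  // drop the oldest timestamp
    }
  }

  List<Long> getSuccessTimestamps() {
    return new ArrayList<>(successTimestamps);
  }

  List<Long> getErrorTimestamps() {
    return new ArrayList<>(errorMessages.keySet());
  }
}
```

Nothing in this structure observes what happens after the task is handed to a minion, so a task that is generated successfully but fails during execution would still appear as a success here.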



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/BaseTaskGeneratorInfo.java:
##########
@@ -0,0 +1,49 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import org.apache.pinot.spi.utils.JsonUtils;
+
+
+/**
+ * Base abstract class for task generator info.
+ */
+public abstract class BaseTaskGeneratorInfo {

Review Comment:
   Do you want to add `implements Serializable`?





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926260938


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,151 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.PriorityQueue;
+import java.util.Queue;
+import java.util.TreeMap;
+import javax.annotation.Nonnull;
+import org.apache.helix.ZNRecord;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run
+ * message.

Review Comment:
   Ack





[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r930774838


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +260,55 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Produces(MediaType.APPLICATION_JSON)
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public String getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType,
+      @ApiParam(value = "Whether to only lookup local cache for logs", defaultValue = "false") @QueryParam("localOnly")
+          boolean localOnly)
+      throws JsonProcessingException {
+    if (localOnly) {
+      BaseTaskGeneratorInfo taskGeneratorMostRecentRunInfo =
+          _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);
+      if (taskGeneratorMostRecentRunInfo == null) {
+        throw new ControllerApplicationException(LOGGER, "Task generation information not found",
+            Response.Status.NOT_FOUND);
+      }
+
+      return JsonUtils.objectToString(taskGeneratorMostRecentRunInfo);
+    }
+
+    // Call all controllers
+    List<InstanceConfig> controllers = _pinotHelixResourceManager.getAllControllerInstanceConfigs();

Review Comment:
   Since the POST `/tasks/schedule` API lets one schedule tasks for a given table on any controller (non-leader included), we do need to query all controllers here.
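
The fan-out this implies can be sketched as follows, assuming each controller can be asked for its local cache entry and may have none. `ControllerFanOutSketch`, `collect`, and the `fetchLocalInfo` function are hypothetical; `fetchLocalInfo` stands in for whatever call reaches a controller with `localOnly=true`, which this excerpt does not show.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical illustration of querying every controller and merging results.
class ControllerFanOutSketch {
  /**
   * Asks each controller for its locally cached generator info and keeps
   * only the controllers that actually have an entry.
   */
  static Map<String, String> collect(List<String> controllerIds,
      Function<String, Optional<String>> fetchLocalInfo) {
    Map<String, String> infoByController = new LinkedHashMap<>();
    for (String controllerId : controllerIds) {
      fetchLocalInfo.apply(controllerId)
          .ifPresent(info -> infoByController.put(controllerId, info));
    }
    return infoByController;
  }
}
```

Keying the merged result by controller keeps it visible which instance ran the generator, which matters when a non-leader controller scheduled the task.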





[GitHub] [pinot] mcvsubbu commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r931280071


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +260,55 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Produces(MediaType.APPLICATION_JSON)
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public String getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType,
+      @ApiParam(value = "Whether to only lookup local cache for logs", defaultValue = "false") @QueryParam("localOnly")
+          boolean localOnly)
+      throws JsonProcessingException {
+    if (localOnly) {
+      BaseTaskGeneratorInfo taskGeneratorMostRecentRunInfo =
+          _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);
+      if (taskGeneratorMostRecentRunInfo == null) {
+        throw new ControllerApplicationException(LOGGER, "Task generation information not found",
+            Response.Status.NOT_FOUND);
+      }
+
+      return JsonUtils.objectToString(taskGeneratorMostRecentRunInfo);
+    }
+
+    // Call all controllers
+    List<InstanceConfig> controllers = _pinotHelixResourceManager.getAllControllerInstanceConfigs();

Review Comment:
   Ah OK, so the POST actually lets _any_ controller schedule the job for a table? Hmm... maybe we should fix that too. But I see what you are saying here. My bad.





[GitHub] [pinot] npawar commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
npawar commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1197315909

   > @mcvsubbu
   > 
   > 1. @npawar @Jackie-Jiang ? I might have just very rough, possibly inaccurate numbers.
   > 2. I feel that a control-plane-level API within Pinot that gives an overall view into the current and past state of minion tasks is important to us, with the task generator being a key part of the entire minion task flow. While metrics can help to some extent, capturing details like failure stack traces with them might be difficult. This API avoids having to tally metrics and debug logs from a separate log processing system.
   > 3. I suppose it could. But integrating the log processing framework into the Pinot APIs themselves might be a bit of a challenge. Having a system table for these kinds of use cases might be the right way forward, such that Pinot itself can store and serve debug data and status metrics for each of the components / flows. Essentially, move from in-memory storage of logs and metrics into the system table.
   > 
   > @npawar @Jackie-Jiang to add more
   
   For 1: typically, for users of RealtimeToOfflineTask, tasks get generated hourly. For SegmentGenerationAndPushTask, it can be way more frequent, depending on the number of times files are generated in the source dir. MergeRollupTasks are less frequent, but still run several times a day. There might be others, but this is what we see most commonly set up by users in OSS. In a typical setup, all of these would be configured. It becomes quite confusing for users to have to find the exact exception in the logs, especially because some logs are in the controller (scheduler related) and some in the minion (task execution related). This API will help us make the feedback loop quicker, especially when we add this into the new Minion tab on the Pinot Admin UI.
   
   Regarding info already in Helix, these scheduler related exceptions are not present in the Helix generated metadata.




[GitHub] [pinot] mcvsubbu commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1197352989

   > > @mcvsubbu
   > > 
   > > 1. @npawar @Jackie-Jiang ? I might have just very rough, possibly inaccurate numbers.
   > > 2. I feel that a control-plane-level API within Pinot that gives an overall view into the current and past state of minion tasks is important to us, with the task generator being a key part of the entire minion task flow. While metrics can help to some extent, capturing details like failure stack traces with them might be difficult. This API avoids having to tally metrics and debug logs from a separate log processing system.
   > > 3. I suppose it could. But integrating the log processing framework into the Pinot APIs themselves might be a bit of a challenge. Having a system table for these kinds of use cases might be the right way forward, such that Pinot itself can store and serve debug data and status metrics for each of the components / flows. Essentially, move from in-memory storage of logs and metrics into the system table.
   > > 
   > > @npawar @Jackie-Jiang to add more
   > 
   > For 1: typically, for users of RealtimeToOfflineTask, tasks get generated hourly. For SegmentGenerationAndPushTask, it can be way more frequent, depending on the number of times files are generated in the source dir. MergeRollupTasks are less frequent, but still run several times a day. There might be others, but this is what we see most commonly set up by users in OSS. In a typical setup, all of these would be configured. It becomes quite confusing for users to have to find the exact exception in the logs, especially because some logs are in the controller (scheduler related) and some in the minion (task execution related). This API will help us make the feedback loop quicker, especially when we add this into the new Minion tab on the Pinot Admin UI.
   > 
   > Regarding info already in Helix, these scheduler related exceptions are not present in the Helix generated metadata.
   
   I still think log processing should be the answer to this (and other similar PRs that may come up in the future). We should not be adding a new API for every error condition we may encounter in the system (and log something).
   
   @siddharthteotia , @npawar , @Jackie-Jiang , @snleee, @kishoreg, @mayankshriv  what do you think? If the PMCs don't have any objection to this then I can live with this, but I am willing to bet that more such PRs will come up because logs are difficult to read. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] saurabhd336 commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1196440928

   @mcvsubbu 
   1) @npawar @Jackie-Jiang ? I might have just very rough, possibly inaccurate numbers.
   
   2) I feel that a control-plane-level API within Pinot that gives an overall view of the current and past state of minion tasks is important to us, the task generator being a key part of the entire minion task flow. Metrics can help to some extent, but surfacing details such as failure stack traces through them is difficult. This API avoids having to tally metrics and debug logs from a separate log processing system.
   
   3) I suppose it could. But integrating the log processing framework into the Pinot APIs themselves might be a challenge. A system table for these kinds of use cases might be the right way forward, so that Pinot itself can store and serve debug data and status metrics for each of the components / flows. Essentially, we would move from in-memory storage of logs and metrics to the system table.
   
   @npawar @Jackie-Jiang to add more
   
   




[GitHub] [pinot] mcvsubbu commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r930314460


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +260,55 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Produces(MediaType.APPLICATION_JSON)
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public String getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType,
+      @ApiParam(value = "Whether to only lookup local cache for logs", defaultValue = "false") @QueryParam("localOnly")
+          boolean localOnly)
+      throws JsonProcessingException {
+    if (localOnly) {
+      BaseTaskGeneratorInfo taskGeneratorMostRecentRunInfo =
+          _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);
+      if (taskGeneratorMostRecentRunInfo == null) {
+        throw new ControllerApplicationException(LOGGER, "Task generation information not found",
+            Response.Status.NOT_FOUND);
+      }
+
+      return JsonUtils.objectToString(taskGeneratorMostRecentRunInfo);
+    }
+
+    // Call all controllers
+    List<InstanceConfig> controllers = _pinotHelixResourceManager.getAllControllerInstanceConfigs();

Review Comment:
   You don't need to send the request to all controllers; sending it to the lead controller is enough.





[GitHub] [pinot] mcvsubbu commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1195866632

   I had asked some questions about the use case you are tackling. Can you please answer those? How frequently are you generating tasks (in your use case)? Is a metric bump not enough for your use case? Can log processing help solve this use case?




[GitHub] [pinot] mcvsubbu commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1196980176

   > @mcvsubbu
   > 
   > 1. @npawar @Jackie-Jiang ? I might have just very rough, possibly inaccurate numbers.
   > 2. I feel that a control-plane-level API within Pinot that gives an overall view of the current and past state of minion tasks is important to us, the task generator being a key part of the entire minion task flow. Metrics can help to some extent, but surfacing details such as failure stack traces through them is difficult. This API avoids having to tally metrics and debug logs from a separate log processing system.
   
   I agree there is a need to debug minions. However, this API does not provide any insights beyond what Helix already provides. So, let us work from some concrete examples: what is the use case in which you think this will help?
   
   > 3. I suppose it could. But integrating the log processing framework into the Pinot APIs themselves might be a challenge. A system table for these kinds of use cases might be the right way forward, so that Pinot itself can store and serve debug data and status metrics for each of the components / flows. Essentially, we would move from in-memory storage of logs and metrics to the system table.
   
   Every installation has its own way of processing logs. Cloud installations typically pipe the logs into a log processing service to get insights from them. On-prem installations have either their own way of piping the logs, or ways to get onto the host and look through the logs.
   
   > 
   > @npawar @Jackie-Jiang to add more
   
   The reason I am reluctant to add more APIs is that the more we add, the more we need to keep backward compatible. My question to @saurabhd336 is: Have you looked at what Helix has to offer? If so, what are the shortcomings?
   
   




[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r926263366


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +260,55 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Produces(MediaType.APPLICATION_JSON)
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public String getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType,
+      @ApiParam(value = "Whether to only lookup local cache for logs", defaultValue = "false") @QueryParam("localOnly")
+          boolean localOnly)
+      throws JsonProcessingException {
+    if (localOnly) {
+      BaseTaskGeneratorInfo taskGeneratorMostRecentRunInfo =
+          _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);
+      if (taskGeneratorMostRecentRunInfo == null) {
+        throw new ControllerApplicationException(LOGGER, "Task generation information not found",
+            Response.Status.NOT_FOUND);
+      }
+
+      return JsonUtils.objectToString(taskGeneratorMostRecentRunInfo);
+    }
+
+    // Call all controllers
+    List<InstanceConfig> controllers = _pinotHelixResourceManager.getAllControllerInstanceConfigs();
+    // Relying on original schema that was used to query the controller
+    URI uri = _uriInfo.getRequestUri();
+    String scheme = uri.getScheme();
+    List<String> controllerUrls = controllers.stream().map(controller -> {
+      return String.format("%s://%s:%d/tasks/generator/%s/%s/debug?localOnly=true", scheme, controller.getHostName(),
+          Integer.parseInt(controller.getPort()), tableNameWithType, taskType);
+    }).collect(Collectors.toList());
+
+    CompletionServiceHelper completionServiceHelper =
+        new CompletionServiceHelper(_executor, _connectionManager, HashBiMap.create(0));
+    CompletionServiceHelper.CompletionServiceResponse serviceResponse =
+        completionServiceHelper.doMultiGetRequest(controllerUrls, null, true, 10000);
+
+    List<JsonNode> result = new ArrayList<>();
+    serviceResponse._httpResponses.values().forEach(resp -> {
+      try {
+        result.add(JsonUtils.stringToJsonNode(resp));
+      } catch (IOException e) {
+        LOGGER.error("Failed to parse ");

Review Comment:
   Ack





[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r927117765


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskManagerStatusCache.java:
##########
@@ -0,0 +1,32 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import java.util.function.Consumer;
+
+
+public abstract class TaskManagerStatusCache<T extends BaseTaskGeneratorInfo> {

Review Comment:
   Suggest changing it to an interface
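
As a rough illustration of the suggestion (not the code that was merged), the cache could be expressed as an interface with a simple in-memory implementation behind it. In the sketch below, only `fetchTaskGeneratorInfo` and `getTaskType` are taken from the diff; `saveTaskGeneratorInfo`, its `Consumer`-based signature, `RunCounter`, and `InMemoryCache` are assumptions made for the example, and `BaseTaskGeneratorInfo` is reduced to a stub.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

public class StatusCacheSketch {
  // Stub standing in for the PR's BaseTaskGeneratorInfo; only getTaskType() is visible in the diff.
  public abstract static class BaseTaskGeneratorInfo {
    public abstract String getTaskType();
  }

  // Interface form of the cache. fetchTaskGeneratorInfo is called in the diff;
  // saveTaskGeneratorInfo and its signature are assumed for this sketch.
  public interface TaskManagerStatusCache<T extends BaseTaskGeneratorInfo> {
    T fetchTaskGeneratorInfo(String tableNameWithType, String taskType);

    void saveTaskGeneratorInfo(String tableNameWithType, String taskType, Consumer<T> updater);
  }

  // Hypothetical info type that only counts generator runs.
  public static class RunCounter extends BaseTaskGeneratorInfo {
    private final String _taskType;
    public int _runs;

    public RunCounter(String taskType) {
      _taskType = taskType;
    }

    @Override
    public String getTaskType() {
      return _taskType;
    }
  }

  // Hypothetical in-memory implementation keyed by table name and task type.
  public static class InMemoryCache implements TaskManagerStatusCache<RunCounter> {
    private final Map<String, RunCounter> _entries = new ConcurrentHashMap<>();

    private static String key(String tableNameWithType, String taskType) {
      return tableNameWithType + "/" + taskType;
    }

    @Override
    public RunCounter fetchTaskGeneratorInfo(String tableNameWithType, String taskType) {
      return _entries.get(key(tableNameWithType, taskType));
    }

    @Override
    public void saveTaskGeneratorInfo(String tableNameWithType, String taskType, Consumer<RunCounter> updater) {
      // Create the entry on first use, then let the caller mutate it under the map's per-key lock.
      _entries.compute(key(tableNameWithType, taskType), (k, existing) -> {
        RunCounter info = existing != null ? existing : new RunCounter(taskType);
        updater.accept(info);
        return info;
      });
    }
  }
}
```

With an interface, a deployment could swap the in-memory map for another store without touching the REST resource that reads from it.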



##########
pinot-common/src/main/java/org/apache/pinot/common/minion/TaskGeneratorMostRecentRunInfo.java:
##########
@@ -0,0 +1,120 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.minion;
+
+import com.fasterxml.jackson.annotation.JsonPropertyOrder;
+import com.google.common.annotations.VisibleForTesting;
+import java.time.Instant;
+import java.time.OffsetDateTime;
+import java.time.ZoneId;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.TreeMap;
+import java.util.stream.Collectors;
+
+
+/**
+ * a task generator running history which keeps the most recent several success run timestamp and the most recent
+ * several error run messages.
+ */
+@JsonPropertyOrder({"tableNameWithType", "taskType", "mostRecentSuccessRunTS", "mostRecentErrorRunMessages"})
+public class TaskGeneratorMostRecentRunInfo extends BaseTaskGeneratorInfo {
+  @VisibleForTesting
+  static final int MAX_NUM_OF_HISTORY_TO_KEEP = 5;
+  private final String _taskType;
+  private final String _tableNameWithType;
+  // the timestamp to error message map of the most recent several error runs
+  private final TreeMap<Long, String> _mostRecentErrorRunMessages;
+  // the timestamp of the most recent several success runs
+  private final List<Long> _mostRecentSuccessRunTS;
+
+  private TaskGeneratorMostRecentRunInfo(String tableNameWithType, String taskType) {
+    _tableNameWithType = tableNameWithType;
+    _taskType = taskType;
+    _mostRecentErrorRunMessages = new TreeMap<>();
+    _mostRecentSuccessRunTS = new ArrayList<>();
+  }
+
+  /**
+   * Returns the table name with type
+   */
+  public String getTableNameWithType() {
+    return _tableNameWithType;
+  }
+
+  @Override
+  public String getTaskType() {
+    return _taskType;
+  }
+
+  /**
+   * Gets the timestamp to error message map of the most recent several error runs
+   */
+  public TreeMap<String, String> getMostRecentErrorRunMessages() {
+    TreeMap<String, String> result = new TreeMap<>();
+    _mostRecentErrorRunMessages.forEach((timestamp, error) -> result.put(
+        OffsetDateTime.ofInstant(Instant.ofEpochMilli(timestamp), ZoneId.of("UTC")).toString(), error));

Review Comment:
   (minor)
   ```suggestion
           OffsetDateTime.ofInstant(Instant.ofEpochMilli(timestamp), ZoneOffset.UTC).toString(), error));
   ```
   
   We can also consider using `ZoneId.systemDefault()`, which might be more readable, but it might also be confusing when the client is in a different timezone from the server. IMO both ways have pros and cons, so your call.
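
To make the trade-off concrete, here is a minimal sketch (not code from the PR) that formats one epoch-millisecond timestamp in UTC, in a fixed -07:00 offset standing in for a server in another timezone, and in the JVM default zone. The class name `TimestampFormatDemo` and the sample timestamp are illustrative assumptions.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampFormatDemo {
  // Formats an epoch-millisecond timestamp as an ISO-8601 string with the offset of the given zone.
  public static String format(long epochMillis, ZoneId zone) {
    return OffsetDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), zone).toString();
  }

  public static void main(String[] args) {
    long ts = 1657820470000L; // 2022-07-14 17:41:10 UTC

    // Same string for every client, regardless of where the server runs.
    System.out.println(format(ts, ZoneOffset.UTC));

    // Fixed offset used as a stand-in for a server whose default zone is not UTC.
    System.out.println(format(ts, ZoneOffset.ofHours(-7)));

    // Depends on the JVM's configured timezone, so the output varies by host.
    System.out.println(format(ts, ZoneId.systemDefault()));
  }
}
```

Because `OffsetDateTime.toString()` always includes the offset, both forms remain unambiguous; the UTC form simply keeps the keys of the returned map stable across controllers in different timezones.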





[GitHub] [pinot] npawar commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
npawar commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r922536269


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +236,23 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public BaseTaskGeneratorInfo getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType) {
+    BaseTaskGeneratorInfo
+        taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);

Review Comment:
   > @saurabhd336 thanks a lot for your contribution. It has been hard to debug minions, so I totally understand your pain point. However, I have a few general questions:
   > 
   > 1. What is the use case where you think this will help? (we have some use cases where we saw problems, and instead of introducing one new API per use case, I am looking to make sure we do something that works for more than one. @sajjad-moradi  can review from our standpoint).
   > 2. I know @npawar referred to a debug API on the minion, but I could not locate one. Please correct me if I am wrong. I think we should issue an API to the minions as well (like we do with servers) and get the latest status from them and display it in the controller.
   
   Hey Subbu, this is the one for Minion tasks: `/tasks/task/{taskName}/debug`. In the list of subtasks returned, there's an "info" field which contains the exception (if any) encountered by each subtask.





[GitHub] [pinot] npawar commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
npawar commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r922538643


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +236,23 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public BaseTaskGeneratorInfo getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType) {
+    BaseTaskGeneratorInfo
+        taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);

Review Comment:
   Ah, good catch about the lead controller! This won't work as is; we'll have to think about how to handle that.





[GitHub] [pinot] npawar commented on pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
npawar commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1184850243

   > Overall, I am a plus 1 on more minion debuggability.
   > 
   > My view is that we should add APIs to minions similar to that of servers, and pull debugging info from there and display it.
   
   We already have minion APIs (look for the /debug endpoint in the Tasks tab). This one is for scheduler errors, and the scope of the scheduler remains limited to the controller, so we thought it made the most sense as an API here.




[GitHub] [pinot] saurabhd336 commented on a diff in pull request #9058: Task genrator debug api

Posted by GitBox <gi...@apache.org>.
saurabhd336 commented on code in PR #9058:
URL: https://github.com/apache/pinot/pull/9058#discussion_r923134910


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java:
##########
@@ -231,6 +236,23 @@ public Map<String, PinotHelixTaskResourceManager.TaskDebugInfo> getTasksDebugInf
     return _pinotHelixTaskResourceManager.getTasksDebugInfoByTable(taskType, tableNameWithType, verbosity);
   }
 
+  @GET
+  @Path("/tasks/generator/{tableNameWithType}/{taskType}/debug")
+  @ApiOperation("Fetch task generation information for the recent runs of the given task for the given table")
+  public BaseTaskGeneratorInfo getTaskGenerationDebugInto(
+      @ApiParam(value = "Task type", required = true) @PathParam("taskType") String taskType,
+      @ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType")
+          String tableNameWithType) {
+    BaseTaskGeneratorInfo
+        taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType);

Review Comment:
   Ack. The status can indeed be stored across different controllers. I am trying the following two approaches. When this API is hit:
   1) The controller makes an HTTP call to every other controller to get their respective in-memory debug statuses, waits for the responses, and finally responds with a list of all the responses.
   2) The controller uses the Helix messaging framework to broadcast a message to each controller and, via the `AsyncCallback`, collates all the replies and responds (https://github.com/saurabhd336/pinot/pull/9/files)
   
   I'll test and evaluate both approaches and update this PR accordingly.
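
Approach 1 can be sketched roughly as follows, using only the JDK rather than Pinot's `CompletionServiceHelper`. The class `ControllerFanOutSketch`, the `fetcher` function (a stand-in for the real HTTP GET against each controller), and the per-call timeout are all assumptions for illustration, not the PR's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

public class ControllerFanOutSketch {
  // Calls every controller URL in parallel and returns the responses that arrive in time,
  // in the same order as the input URLs. 'fetcher' is a hypothetical stand-in for an HTTP GET.
  public static List<String> collect(List<String> urls, Function<String, String> fetcher, long timeoutMs) {
    ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, urls.size()));
    try {
      List<CompletableFuture<String>> futures = new ArrayList<>();
      for (String url : urls) {
        futures.add(CompletableFuture.supplyAsync(() -> fetcher.apply(url), executor));
      }

      List<String> responses = new ArrayList<>();
      for (CompletableFuture<String> future : futures) {
        try {
          // The timeout applies to each wait separately, not to the fan-out as a whole.
          responses.add(future.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
          // Preserve the interrupt flag and stop waiting for the remaining controllers.
          Thread.currentThread().interrupt();
          break;
        } catch (ExecutionException | TimeoutException e) {
          // A controller that fails or is too slow contributes nothing to the result.
        }
      }
      return responses;
    } finally {
      executor.shutdownNow();
    }
  }
}
```

Skipping failed controllers keeps the endpoint usable when one controller is down, at the cost of possibly missing the one entry the caller wanted, so a real implementation would likely also report which controllers did not answer.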


