You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/05/05 04:46:43 UTC

[GitHub] [incubator-pinot] jackjlli opened a new pull request #5332: Detect segments which replicas are all in ERROR state

jackjlli opened a new pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332


   Currently when pinot broker builds the routing table, the segments in routing table are always valid (either online or consuming). While the segments in ERROR state won't be captured during querying; those bad segments could have been queried. In this case, the query should be partial response.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420236949



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {

Review comment:
       Suggest changing the arg name to segmentsWithNoReplicas




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420251130



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {
+    Set<String> partitionSet = externalView.getPartitionSet();
+    for (String partitionName : partitionSet) {
+      Map<String, String> stateMap = externalView.getStateMap(partitionName);
+      int numReplicas = stateMap.size();
+      int errorCount = 0;
+      for (String segmentState : stateMap.values()) {
+        if (SegmentOnlineOfflineStateModel.ERROR.equals(segmentState)) {
+          errorCount++;
+        }
+      }
+      if (errorCount == numReplicas) {
+        noReplicaSegments.add(partitionName);
+      }
+    }
+  }
+
+  /**
+   * Checks whether the broker request matches with any segments which replicas are all in ERROR state.
+   */
+  public boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {

Review comment:
       Is this for test only?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420456824



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingTable.java
##########
@@ -0,0 +1,44 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.broker.routing;
+
+import java.util.List;
+import java.util.Map;
+import javax.annotation.Nullable;
+import org.apache.pinot.core.transport.ServerInstance;
+
+
+public class RoutingTable {
+  private Map<ServerInstance, List<String>> _serverInstanceToSegmentsMap;
+  private List<String> _segmentsWithNoReplicas;
+
+  public RoutingTable(@Nullable Map<ServerInstance, List<String>> serverInstanceToSegmentsMap,

Review comment:
       I don't want `segmentsWithNoReplicas` to be nullable. Being null could indicate a lot of meanings, and it's hard to handle. Here `segmentsWithNoReplicas` should only be either empty or non-empty. That's the only information we want.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420264711



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -324,6 +325,7 @@ public synchronized void buildRouting(String tableNameWithType) {
     // Add time boundary manager if both offline and real-time part exist for a hybrid table
     TimeBoundaryManager timeBoundaryManager = null;
     String rawTableName = TableNameBuilder.extractRawTableName(tableNameWithType);
+    Set<String> noReplicaSegments = new HashSet<>();

Review comment:
       +1

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/api/RequestStatistics.java
##########
@@ -47,53 +52,13 @@
   private int _numServersResponded;
   private boolean _isNumGroupsLimitReached;
   private int _numExceptions;
-  private String _brokerId;
-  private long _requestId;
-
-  public String getBrokerId() {
-    return _brokerId;
-  }
-
-  public long getRequestId() {
-    return _requestId;
-  }
-
-  public long getRequestArrivalTimeMillis() {
-    return _requestArrivalTimeMillis;
-  }
-
-  public long getReduceTimeMillis() {
-    return _reduceTimeMillis;
-  }
-
+  private boolean _queryNoReplicaSegments;

Review comment:
       ```suggestion
     private int _unavailableSegmentCount;
   ```
   
   Instead of a boolean, let us provide more information. How many segments are not available (as opposed to whether there were some segments not available)

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -332,6 +334,7 @@ public synchronized void buildRouting(String tableNameWithType) {
         timeBoundaryManager = new TimeBoundaryManager(tableConfig, _propertyStore);
         timeBoundaryManager.init(externalView, onlineSegments);
       }
+      fetchSegmentsInErrorState(externalView, noReplicaSegments);

Review comment:
       You need to do this for realtime table also, since that may also have segments in ERROR state

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {
+    Set<String> partitionSet = externalView.getPartitionSet();
+    for (String partitionName : partitionSet) {
+      Map<String, String> stateMap = externalView.getStateMap(partitionName);
+      int numReplicas = stateMap.size();
+      int errorCount = 0;
+      for (String segmentState : stateMap.values()) {
+        if (SegmentOnlineOfflineStateModel.ERROR.equals(segmentState)) {
+          errorCount++;
+        }

Review comment:
       you can break out of the loop if any of the segments are NOT in error state

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {

Review comment:
       Suggest rename method to `findUnavailableSegments` In the comments you can mention that these are only ERROR state segments. Realtime consuming segments that may be unavailable are not treated here but are accounted for in the staleness. If someone gets an idea and starts to add realtime unavailable segments here, we have a problem, right?

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {
+    Set<String> partitionSet = externalView.getPartitionSet();
+    for (String partitionName : partitionSet) {
+      Map<String, String> stateMap = externalView.getStateMap(partitionName);
+      int numReplicas = stateMap.size();
+      int errorCount = 0;
+      for (String segmentState : stateMap.values()) {
+        if (SegmentOnlineOfflineStateModel.ERROR.equals(segmentState)) {
+          errorCount++;
+        }
+      }
+      if (errorCount == numReplicas) {
+        noReplicaSegments.add(partitionName);
+      }
+    }
+  }
+
+  /**
+   * Checks whether the broker request matches with any segments which replicas are all in ERROR state.
+   */
+  public boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {

Review comment:
       No, this is to reflect the status of the execution of a request. We want to know if there were unavailable segments.

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {
+    Set<String> partitionSet = externalView.getPartitionSet();
+    for (String partitionName : partitionSet) {
+      Map<String, String> stateMap = externalView.getStateMap(partitionName);
+      int numReplicas = stateMap.size();
+      int errorCount = 0;
+      for (String segmentState : stateMap.values()) {
+        if (SegmentOnlineOfflineStateModel.ERROR.equals(segmentState)) {
+          errorCount++;
+        }
+      }
+      if (errorCount == numReplicas) {
+        noReplicaSegments.add(partitionName);
+      }
+    }
+  }
+
+  /**
+   * Checks whether the broker request matches with any segments which replicas are all in ERROR state.
+   */
+  public boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {

Review comment:
       ```suggestion
     public int containsNoReplicaSegments(BrokerRequest brokerRequest) {
   ```
   Return the number of segments unavailable here

##########
File path: pinot-broker/src/test/java/org/apache/pinot/broker/routing/RoutingManagerTest.java
##########
@@ -0,0 +1,114 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.broker.routing;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.AccessOption;
+import org.apache.helix.BaseDataAccessor;
+import org.apache.helix.HelixDataAccessor;
+import org.apache.helix.HelixManager;
+import org.apache.helix.PropertyKey;
+import org.apache.helix.ZNRecord;
+import org.apache.helix.model.ExternalView;
+import org.apache.helix.store.zk.ZkHelixPropertyStore;
+import org.apache.pinot.common.metrics.BrokerMetrics;
+import org.apache.pinot.common.request.BrokerRequest;
+import org.apache.pinot.common.request.QuerySource;
+import org.apache.pinot.common.utils.CommonConstants.Helix.StateModel.SegmentOnlineOfflineStateModel;
+import org.apache.pinot.common.utils.config.TableConfigUtils;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.apache.pinot.spi.utils.builder.TableNameBuilder;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import static org.mockito.ArgumentMatchers.*;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+
+public class RoutingManagerTest {

Review comment:
       I would add two segments, one that belongs in the partition and one that does not.
   We can evaluate that the resulting unavailable segments is 1 (and not 2)

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -551,5 +585,16 @@ void refreshSegment(String segment) {
       }
       return _instanceSelector.select(brokerRequest, selectedSegments);
     }
+
+    boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {
+      if (_noReplicaSegments.isEmpty()) {
+        return false;
+      }
+      List<String> selectedSegments = new ArrayList<>(_noReplicaSegments);

Review comment:
       A comment before the for loop will be really useful. I had a the same exact question as Sidd did until I went through it a few times.
   Basically, we are trying to see if the unavailable segments would never have been hit by a query. If that is the case, then we are good for this query (e.g. in a partitioned use case, we may prune the unavailable segments).
   Please add this as a comment block.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420237588



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -551,5 +585,16 @@ void refreshSegment(String segment) {
       }
       return _instanceSelector.select(brokerRequest, selectedSegments);
     }
+
+    boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {

Review comment:
       Small javadoc would be good




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420467067



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingTable.java
##########
@@ -0,0 +1,44 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.broker.routing;
+
+import java.util.List;
+import java.util.Map;
+import javax.annotation.Nullable;
+import org.apache.pinot.core.transport.ServerInstance;
+
+
+public class RoutingTable {
+  private Map<ServerInstance, List<String>> _serverInstanceToSegmentsMap;
+  private List<String> _segmentsWithNoReplicas;
+
+  public RoutingTable(@Nullable Map<ServerInstance, List<String>> serverInstanceToSegmentsMap,

Review comment:
       agreed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420250436



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {

Review comment:
       Currently this is being called when we build the routing table during init. Right? Should we also call it upon every external view update/change?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-io edited a comment on pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#issuecomment-624344992


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=h1) Report
   > Merging [#5332](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/c87be271bfe4c923b59f4e68b85a51e518112e54&el=desc) will **decrease** coverage by `9.27%`.
   > The diff coverage is `40.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/5332/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #5332      +/-   ##
   ==========================================
   - Coverage   66.21%   56.93%   -9.28%     
   ==========================================
     Files        1072     1074       +2     
     Lines       54552    54760     +208     
     Branches     8137     8164      +27     
   ==========================================
   - Hits        36123    31180    -4943     
   - Misses      15801    21140    +5339     
   + Partials     2628     2440     -188     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...org/apache/pinot/broker/api/RequestStatistics.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYXBpL1JlcXVlc3RTdGF0aXN0aWNzLmphdmE=) | `0.00% <0.00%> (-64.41%)` | :arrow_down: |
   | [...e/pinot/broker/api/resources/PinotBrokerDebug.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYXBpL3Jlc291cmNlcy9QaW5vdEJyb2tlckRlYnVnLmphdmE=) | `0.00% <0.00%> (-76.67%)` | :arrow_down: |
   | [...roker/requesthandler/BaseBrokerRequestHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcmVxdWVzdGhhbmRsZXIvQmFzZUJyb2tlclJlcXVlc3RIYW5kbGVyLmphdmE=) | `19.84% <0.00%> (-59.52%)` | :arrow_down: |
   | [...rg/apache/pinot/broker/routing/RoutingManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9Sb3V0aW5nTWFuYWdlci5qYXZh) | `68.66% <84.61%> (-12.50%)` | :arrow_down: |
   | [.../org/apache/pinot/broker/routing/RoutingTable.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9Sb3V0aW5nVGFibGUuamF2YQ==) | `100.00% <100.00%> (ø)` | |
   | [...a/org/apache/pinot/minion/metrics/MinionMeter.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/pinot/minion/metrics/MinionQueryPhase.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vbWV0cmljcy9NaW5pb25RdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/pinot/core/query/reduce/ComparisonFunction.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvQ29tcGFyaXNvbkZ1bmN0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...pinot/minion/exception/TaskCancelledException.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtbWluaW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9taW5pb24vZXhjZXB0aW9uL1Rhc2tDYW5jZWxsZWRFeGNlcHRpb24uamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [332 more](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=footer). Last update [edef309...3d1983f](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
jackjlli commented on pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#issuecomment-624427610


   As discussed offline, segments which are shown in disabled instance are also unreachable. Thus, this case will also need to be considered as unavailable as well.
   Once https://github.com/apache/incubator-pinot/pull/5337 contains the changes in RequestStatistics class, we're good to close this PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420236589



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -324,6 +325,7 @@ public synchronized void buildRouting(String tableNameWithType) {
     // Add time boundary manager if both offline and real-time part exist for a hybrid table
     TimeBoundaryManager timeBoundaryManager = null;
     String rawTableName = TableNameBuilder.extractRawTableName(tableNameWithType);
+    Set<String> noReplicaSegments = new HashSet<>();

Review comment:
       Suggest changing to segmentsWithNoReplicas

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {

Review comment:
       (nit): Should we say "Fetches segments which are in error state on all of of their replicas"? IIUC, this is what the function is doing




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420325927



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {

Review comment:
       Also we should rebuild our internal state of unavailable segments upon every external view change. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420249691



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -551,5 +585,16 @@ void refreshSegment(String segment) {
       }
       return _instanceSelector.select(brokerRequest, selectedSegments);
     }
+
+    boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {
+      if (_noReplicaSegments.isEmpty()) {
+        return false;
+      }
+      List<String> selectedSegments = new ArrayList<>(_noReplicaSegments);

Review comment:
       My bad. I see what this function is returning. Ignore point 2




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420339398



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -332,6 +334,7 @@ public synchronized void buildRouting(String tableNameWithType) {
         timeBoundaryManager = new TimeBoundaryManager(tableConfig, _propertyStore);
         timeBoundaryManager.init(externalView, onlineSegments);
       }
+      fetchSegmentsInErrorState(externalView, noReplicaSegments);

Review comment:
       Added.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420243869



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -551,5 +585,16 @@ void refreshSegment(String segment) {
       }
       return _instanceSelector.select(brokerRequest, selectedSegments);
     }
+
+    boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {
+      if (_noReplicaSegments.isEmpty()) {
+        return false;
+      }
+      List<String> selectedSegments = new ArrayList<>(_noReplicaSegments);

Review comment:
       I am not sure I follow this completely. 
   
   1. Why do we have to invoke SegmentPruner to do this?
   
   2. The SegmentPruner API takes the second arg as the segments "to be queried" by the broker request and returns a list of selected segments which can be a subset of passed list. Here we pass the "segmentsWithNoReplicas" list to pruner. IIUC, the API will return a subset of this list which is not what we want. We want to remove the "segmentsWithNoReplicas" from the list of segments to be queried by the broker request. I don't think that is happening here. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#issuecomment-624243740


   > Based on an offline discussion with @kishoreg :
   > 
   > Why not introduce a method in RoutingManager as:
   > `RoutingTable getRoutingTable(BrokerRequest)`
   > 
   > where
   > 
   > `RoutingTable` contains the map and the set of unavailable segments.
   
   +1. And also the logic should not be in RoutingManager but in InstanceSelector. Manager should not perform the actual calculation.
   It's a little bit hard to describe with comments, I'll create a pr demonstrating the idea


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420338275



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -324,6 +325,7 @@ public synchronized void buildRouting(String tableNameWithType) {
     // Add time boundary manager if both offline and real-time part exist for a hybrid table
     TimeBoundaryManager timeBoundaryManager = null;
     String rawTableName = TableNameBuilder.extractRawTableName(tableNameWithType);
+    Set<String> noReplicaSegments = new HashSet<>();

Review comment:
       Updated.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420250436



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {

Review comment:
       Currently this is being called when we build the routing table. Should we also call it upon every external view update/change?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on pull request #5332: Detect segments which replicas are all in ERROR state

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#issuecomment-624161352


   (nit): Suggest changing the PR title to "Detect segments which are in ERROR state on all replicas"


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-io commented on pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#issuecomment-624344992


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=h1) Report
   > Merging [#5332](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/c87be271bfe4c923b59f4e68b85a51e518112e54&el=desc) will **decrease** coverage by `21.49%`.
   > The diff coverage is `36.18%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/5332/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master    #5332       +/-   ##
   ===========================================
   - Coverage   66.21%   44.72%   -21.50%     
   ===========================================
     Files        1072     1074        +2     
     Lines       54552    54760      +208     
     Branches     8137     8164       +27     
   ===========================================
   - Hits        36123    24491    -11632     
   - Misses      15801    28164    +12363     
   + Partials     2628     2105      -523     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...he/pinot/core/data/function/DateTimeFunctions.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL2Z1bmN0aW9uL0RhdGVUaW1lRnVuY3Rpb25zLmphdmE=) | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
   | [...t/core/data/function/DefaultFunctionEvaluator.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL2Z1bmN0aW9uL0RlZmF1bHRGdW5jdGlvbkV2YWx1YXRvci5qYXZh) | `0.00% <0.00%> (-95.75%)` | :arrow_down: |
   | [...che/pinot/core/data/function/FunctionRegistry.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL2Z1bmN0aW9uL0Z1bmN0aW9uUmVnaXN0cnkuamF2YQ==) | `0.00% <0.00%> (-70.84%)` | :arrow_down: |
   | [.../data/manager/realtime/PinotFSSegmentUploader.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvUGlub3RGU1NlZ21lbnRVcGxvYWRlci5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ger/realtime/Server2ControllerSegmentUploader.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvU2VydmVyMkNvbnRyb2xsZXJTZWdtZW50VXBsb2FkZXIuamF2YQ==) | `64.28% <ø> (-7.15%)` | :arrow_down: |
   | [...rc/main/java/org/apache/pinot/spi/data/Schema.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9TY2hlbWEuamF2YQ==) | `0.00% <0.00%> (-76.09%)` | :arrow_down: |
   | [...e/operator/dociditerators/MVScanDocIdIterator.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9kb2NpZGl0ZXJhdG9ycy9NVlNjYW5Eb2NJZEl0ZXJhdG9yLmphdmE=) | `67.18% <14.28%> (-2.31%)` | :arrow_down: |
   | [...rg/apache/pinot/broker/routing/RoutingManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9Sb3V0aW5nTWFuYWdlci5qYXZh) | `76.40% <53.84%> (-4.75%)` | :arrow_down: |
   | [...e/operator/dociditerators/SVScanDocIdIterator.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9kb2NpZGl0ZXJhdG9ycy9TVlNjYW5Eb2NJZEl0ZXJhdG9yLmphdmE=) | `66.66% <60.00%> (-2.85%)` | :arrow_down: |
   | [...org/apache/pinot/broker/api/RequestStatistics.java](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYXBpL1JlcXVlc3RTdGF0aXN0aWNzLmphdmE=) | `64.51% <66.66%> (+0.10%)` | :arrow_up: |
   | ... and [605 more](https://codecov.io/gh/apache/incubator-pinot/pull/5332/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=footer). Last update [edef309...3d1983f](https://codecov.io/gh/apache/incubator-pinot/pull/5332?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420339516



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {
+    Set<String> partitionSet = externalView.getPartitionSet();
+    for (String partitionName : partitionSet) {
+      Map<String, String> stateMap = externalView.getStateMap(partitionName);
+      int numReplicas = stateMap.size();
+      int errorCount = 0;
+      for (String segmentState : stateMap.values()) {
+        if (SegmentOnlineOfflineStateModel.ERROR.equals(segmentState)) {
+          errorCount++;
+        }

Review comment:
       Added.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420446629



##########
File path: pinot-broker/src/test/java/org/apache/pinot/broker/routing/RoutingManagerTest.java
##########
@@ -0,0 +1,114 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.broker.routing;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.AccessOption;
+import org.apache.helix.BaseDataAccessor;
+import org.apache.helix.HelixDataAccessor;
+import org.apache.helix.HelixManager;
+import org.apache.helix.PropertyKey;
+import org.apache.helix.ZNRecord;
+import org.apache.helix.model.ExternalView;
+import org.apache.helix.store.zk.ZkHelixPropertyStore;
+import org.apache.pinot.common.metrics.BrokerMetrics;
+import org.apache.pinot.common.request.BrokerRequest;
+import org.apache.pinot.common.request.QuerySource;
+import org.apache.pinot.common.utils.CommonConstants.Helix.StateModel.SegmentOnlineOfflineStateModel;
+import org.apache.pinot.common.utils.config.TableConfigUtils;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.apache.pinot.spi.utils.builder.TableNameBuilder;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import static org.mockito.ArgumentMatchers.*;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+
+public class RoutingManagerTest {

Review comment:
       I meant two segments in ERROR state, one gets pruned because the segment in error is not accessed. 
   this will be nice to have.

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingTable.java
##########
@@ -0,0 +1,44 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.broker.routing;
+
+import java.util.List;
+import java.util.Map;
+import javax.annotation.Nullable;
+import org.apache.pinot.core.transport.ServerInstance;
+
+
+public class RoutingTable {
+  private Map<ServerInstance, List<String>> _serverInstanceToSegmentsMap;
+  private List<String> _segmentsWithNoReplicas;
+
+  public RoutingTable(@Nullable Map<ServerInstance, List<String>> serverInstanceToSegmentsMap,

Review comment:
       nit: Both arguments are nullable

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
##########
@@ -284,31 +287,43 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
 
     // Calculate routing table for the query
     long routingStartTimeNs = System.nanoTime();
-    Map<ServerInstance, List<String>> offlineRoutingTable = null;
-    Map<ServerInstance, List<String>> realtimeRoutingTable = null;
+    Map<ServerInstance, List<String>> offlineServerInstanceToSegmentsMap = null;
+    Map<ServerInstance, List<String>> realtimeServerInstanceToSegmentsMap = null;
+    // Set number of segments with no replicas matched with this query
+    int unavailableSegmentCount = 0;
     if (offlineBrokerRequest != null) {
       // NOTE: Routing table might be null if table is just removed
-      offlineRoutingTable = _routingManager.getRoutingTable(offlineBrokerRequest);
-      if (offlineRoutingTable == null || offlineRoutingTable.isEmpty()) {
+      RoutingTable offlineRoutingTable = _routingManager.getRoutingTable(offlineBrokerRequest);
+      if (MapUtils.isEmpty(offlineRoutingTable.getServerInstanceToSegmentsMap())) {
         LOGGER.debug("No OFFLINE server found for request {}: {}", requestId, query);
         offlineBrokerRequest = null;
-        offlineRoutingTable = null;
+        offlineServerInstanceToSegmentsMap = null;
+      } else {
+        offlineServerInstanceToSegmentsMap = offlineRoutingTable.getServerInstanceToSegmentsMap();
+        unavailableSegmentCount += offlineRoutingTable.getSegmentsWithNoReplicas().size();
       }
     }
     if (realtimeBrokerRequest != null) {
       // NOTE: Routing table might be null if table is just removed
-      realtimeRoutingTable = _routingManager.getRoutingTable(realtimeBrokerRequest);
-      if (realtimeRoutingTable == null || realtimeRoutingTable.isEmpty()) {
+      RoutingTable realtimeRoutingTable = _routingManager.getRoutingTable(realtimeBrokerRequest);
+      if (MapUtils.isEmpty(realtimeRoutingTable.getServerInstanceToSegmentsMap())) {
         LOGGER.debug("No REALTIME server found for request {}: {}", requestId, query);
         realtimeBrokerRequest = null;
-        realtimeRoutingTable = null;
+        realtimeServerInstanceToSegmentsMap = null;
+      } else {
+        realtimeServerInstanceToSegmentsMap = realtimeRoutingTable.getServerInstanceToSegmentsMap();
+        unavailableSegmentCount += realtimeRoutingTable.getSegmentsWithNoReplicas().size();
       }
     }
     if (offlineBrokerRequest == null && realtimeBrokerRequest == null) {
       LOGGER.info("No server found for request {}: {}", requestId, query);
       _brokerMetrics.addMeteredTableValue(rawTableName, BrokerMeter.NO_SERVER_FOUND_EXCEPTIONS, 1);
       return BrokerResponseNative.EMPTY_RESULT;
     }
+
+    // Set number of segments with no replicas matched with this query
+    requestStatistics.setUnavailableSegmentCount(unavailableSegmentCount);

Review comment:
       Also log this in the (one) log statement that we do for every request:
   ```
   segments(queried/processed/matched/consuming):{}/{}/{}/{}
   ```
   Add "unavail" here




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5332: Detect segments which are in ERROR state on all replicas

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420339011



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {

Review comment:
       Correct. The logic is in `findUnavailableSegments()` in Line 327.

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -551,5 +585,16 @@ void refreshSegment(String segment) {
       }
       return _instanceSelector.select(brokerRequest, selectedSegments);
     }
+
+    boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {

Review comment:
       Added.

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -551,5 +585,16 @@ void refreshSegment(String segment) {
       }
       return _instanceSelector.select(brokerRequest, selectedSegments);
     }
+
+    boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {
+      if (_noReplicaSegments.isEmpty()) {
+        return false;
+      }
+      List<String> selectedSegments = new ArrayList<>(_noReplicaSegments);

Review comment:
       Comment added.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org