You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/05/05 17:15:44 UTC

[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5332: Detect segments which replicas are all in ERROR state

mcvsubbu commented on a change in pull request #5332:
URL: https://github.com/apache/incubator-pinot/pull/5332#discussion_r420264711



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -324,6 +325,7 @@ public synchronized void buildRouting(String tableNameWithType) {
     // Add time boundary manager if both offline and real-time part exist for a hybrid table
     TimeBoundaryManager timeBoundaryManager = null;
     String rawTableName = TableNameBuilder.extractRawTableName(tableNameWithType);
+    Set<String> noReplicaSegments = new HashSet<>();

Review comment:
       +1

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/api/RequestStatistics.java
##########
@@ -47,53 +52,13 @@
   private int _numServersResponded;
   private boolean _isNumGroupsLimitReached;
   private int _numExceptions;
-  private String _brokerId;
-  private long _requestId;
-
-  public String getBrokerId() {
-    return _brokerId;
-  }
-
-  public long getRequestId() {
-    return _requestId;
-  }
-
-  public long getRequestArrivalTimeMillis() {
-    return _requestArrivalTimeMillis;
-  }
-
-  public long getReduceTimeMillis() {
-    return _reduceTimeMillis;
-  }
-
+  private boolean _queryNoReplicaSegments;

Review comment:
       ```suggestion
     private int _unavailableSegmentCount;
   ```
   
   Instead of a boolean, let us provide more information. How many segments are not available (as opposed to whether there were some segments not available)

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -332,6 +334,7 @@ public synchronized void buildRouting(String tableNameWithType) {
         timeBoundaryManager = new TimeBoundaryManager(tableConfig, _propertyStore);
         timeBoundaryManager.init(externalView, onlineSegments);
       }
+      fetchSegmentsInErrorState(externalView, noReplicaSegments);

Review comment:
       You need to do this for realtime table also, since that may also have segments in ERROR state

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {
+    Set<String> partitionSet = externalView.getPartitionSet();
+    for (String partitionName : partitionSet) {
+      Map<String, String> stateMap = externalView.getStateMap(partitionName);
+      int numReplicas = stateMap.size();
+      int errorCount = 0;
+      for (String segmentState : stateMap.values()) {
+        if (SegmentOnlineOfflineStateModel.ERROR.equals(segmentState)) {
+          errorCount++;
+        }

Review comment:
       you can break out of the loop if any of the segments are NOT in error state

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {

Review comment:
       Suggest rename method to `findUnavailableSegments` In the comments you can mention that these are only ERROR state segments. Realtime consuming segments that may be unavailable are not treated here but are accounted for in the staleness. If someone gets an idea and starts to add realtime unavailable segments here, we have a problem, right?

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {
+    Set<String> partitionSet = externalView.getPartitionSet();
+    for (String partitionName : partitionSet) {
+      Map<String, String> stateMap = externalView.getStateMap(partitionName);
+      int numReplicas = stateMap.size();
+      int errorCount = 0;
+      for (String segmentState : stateMap.values()) {
+        if (SegmentOnlineOfflineStateModel.ERROR.equals(segmentState)) {
+          errorCount++;
+        }
+      }
+      if (errorCount == numReplicas) {
+        noReplicaSegments.add(partitionName);
+      }
+    }
+  }
+
+  /**
+   * Checks whether the broker request matches with any segments which replicas are all in ERROR state.
+   */
+  public boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {

Review comment:
       No, this is to reflect the status of the execution of a request. We want to know if there were unavailable segments.

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -458,6 +461,35 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Fetches segments which replicas are all in ERROR state, put the result to noReplicaSegments set.
+   */
+  public void fetchSegmentsInErrorState(ExternalView externalView, Set<String> noReplicaSegments) {
+    Set<String> partitionSet = externalView.getPartitionSet();
+    for (String partitionName : partitionSet) {
+      Map<String, String> stateMap = externalView.getStateMap(partitionName);
+      int numReplicas = stateMap.size();
+      int errorCount = 0;
+      for (String segmentState : stateMap.values()) {
+        if (SegmentOnlineOfflineStateModel.ERROR.equals(segmentState)) {
+          errorCount++;
+        }
+      }
+      if (errorCount == numReplicas) {
+        noReplicaSegments.add(partitionName);
+      }
+    }
+  }
+
+  /**
+   * Checks whether the broker request matches with any segments which replicas are all in ERROR state.
+   */
+  public boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {

Review comment:
       ```suggestion
     public int containsNoReplicaSegments(BrokerRequest brokerRequest) {
   ```
   Return the number of segments unavailable here

##########
File path: pinot-broker/src/test/java/org/apache/pinot/broker/routing/RoutingManagerTest.java
##########
@@ -0,0 +1,114 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.broker.routing;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.AccessOption;
+import org.apache.helix.BaseDataAccessor;
+import org.apache.helix.HelixDataAccessor;
+import org.apache.helix.HelixManager;
+import org.apache.helix.PropertyKey;
+import org.apache.helix.ZNRecord;
+import org.apache.helix.model.ExternalView;
+import org.apache.helix.store.zk.ZkHelixPropertyStore;
+import org.apache.pinot.common.metrics.BrokerMetrics;
+import org.apache.pinot.common.request.BrokerRequest;
+import org.apache.pinot.common.request.QuerySource;
+import org.apache.pinot.common.utils.CommonConstants.Helix.StateModel.SegmentOnlineOfflineStateModel;
+import org.apache.pinot.common.utils.config.TableConfigUtils;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TableType;
+import org.apache.pinot.spi.utils.builder.TableConfigBuilder;
+import org.apache.pinot.spi.utils.builder.TableNameBuilder;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+import static org.mockito.ArgumentMatchers.*;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+
+public class RoutingManagerTest {

Review comment:
       I would add two segments, one that belongs in the partition and one that does not.
   We can evaluate that the resulting unavailable segments is 1 (and not 2)

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
##########
@@ -551,5 +585,16 @@ void refreshSegment(String segment) {
       }
       return _instanceSelector.select(brokerRequest, selectedSegments);
     }
+
+    boolean containsNoReplicaSegments(BrokerRequest brokerRequest) {
+      if (_noReplicaSegments.isEmpty()) {
+        return false;
+      }
+      List<String> selectedSegments = new ArrayList<>(_noReplicaSegments);

Review comment:
       A comment before the for loop will be really useful. I had a the same exact question as Sidd did until I went through it a few times.
   Basically, we are trying to see if the unavailable segments would never have been hit by a query. If that is the case, then we are good for this query (e.g. in a partitioned use case, we may prune the unavailable segments).
   Please add this as a comment block.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org