You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/11/23 22:48:17 UTC

[GitHub] [incubator-pinot] dharakk opened a new pull request #6286: Adding offline dimension table creation and segment assignment

dharakk opened a new pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286


   This PR adds creation and segment assignment of a dim table. Following are the high level changes:
   
   - We are adding a new configuration property `isDimTable` to identify a dim table.
   - If the table is identified as a dim table the segments will be assigned to all the hosts.
   - Adding a new `TableDataManagerType` which will be used to create a DimTableDataManager if the following PRs
   - Adding a `JoinQuickStart` to be used to demo an example dim table and join functinality.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] chenboat commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
chenboat commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r534424189



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,80 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {

Review comment:
       Can you add more javadoc to this new class? It should explain what is the property and difference with other segment assignment strategy of the new DimTableSegmentAssignment?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r540467147



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;
+
+  @Override
+  public void init(HelixManager helixManager, TableConfig tableConfig) {
+    _helixManager = helixManager;
+    _offlineTableName = tableConfig.getTableName();
+    _tenantConfig = tableConfig.getTenantConfig();
+    Preconditions.checkState(tableConfig.isDimTable(), "Not a dimension table: %s" + _offlineTableName);
+  }
+
+  @Override
+  public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    int numInstances = instances.size();
+    Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);

Review comment:
       Yes it can. We can add more host to this tag (tenant), but if there is no host for the tag, we should throw exception because no server will host the segment




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r539726105



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;

Review comment:
       Unable to make these final as the initialization is happening in `init()` method and not the constructor.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-io edited a comment on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-732496192


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=h1) Report
   > Merging [#6286](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=desc) (f630744) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1beaab59b73f26c4e35f3b9bc856b03806cddf5a?el=desc) (1beaab5) will **decrease** coverage by `1.06%`.
   > The diff coverage is `45.12%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6286/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6286      +/-   ##
   ==========================================
   - Coverage   66.44%   65.38%   -1.07%     
   ==========================================
     Files        1075     1286     +211     
     Lines       54773    62061    +7288     
     Branches     8168     9004     +836     
   ==========================================
   + Hits        36396    40576    +4180     
   - Misses      15700    18597    +2897     
   - Partials     2677     2888     +211     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `65.38% <45.12%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...e/pinot/broker/api/resources/PinotBrokerDebug.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYXBpL3Jlc291cmNlcy9QaW5vdEJyb2tlckRlYnVnLmphdmE=) | `0.00% <0.00%> (-79.32%)` | :arrow_down: |
   | [...ot/broker/broker/AllowAllAccessControlFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL0FsbG93QWxsQWNjZXNzQ29udHJvbEZhY3RvcnkuamF2YQ==) | `71.42% <ø> (-28.58%)` | :arrow_down: |
   | [.../helix/BrokerUserDefinedMessageHandlerFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL2hlbGl4L0Jyb2tlclVzZXJEZWZpbmVkTWVzc2FnZUhhbmRsZXJGYWN0b3J5LmphdmE=) | `33.96% <0.00%> (-32.71%)` | :arrow_down: |
   | [...ker/routing/instanceselector/InstanceSelector.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9pbnN0YW5jZXNlbGVjdG9yL0luc3RhbmNlU2VsZWN0b3IuamF2YQ==) | `100.00% <ø> (ø)` | |
   | [...ava/org/apache/pinot/client/AbstractResultSet.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Fic3RyYWN0UmVzdWx0U2V0LmphdmE=) | `66.66% <0.00%> (+9.52%)` | :arrow_up: |
   | [.../main/java/org/apache/pinot/client/Connection.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Nvbm5lY3Rpb24uamF2YQ==) | `35.55% <0.00%> (-13.29%)` | :arrow_down: |
   | [...inot/client/JsonAsyncHttpPinotClientTransport.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0pzb25Bc3luY0h0dHBQaW5vdENsaWVudFRyYW5zcG9ydC5qYXZh) | `10.90% <0.00%> (-51.10%)` | :arrow_down: |
   | [...not/common/assignment/InstancePartitionsUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vYXNzaWdubWVudC9JbnN0YW5jZVBhcnRpdGlvbnNVdGlscy5qYXZh) | `73.80% <ø> (+0.63%)` | :arrow_up: |
   | [.../apache/pinot/common/exception/QueryException.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZXhjZXB0aW9uL1F1ZXJ5RXhjZXB0aW9uLmphdmE=) | `90.27% <ø> (+5.55%)` | :arrow_up: |
   | [...pinot/common/function/AggregationFunctionType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vQWdncmVnYXRpb25GdW5jdGlvblR5cGUuamF2YQ==) | `100.00% <ø> (ø)` | |
   | ... and [1134 more](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=footer). Last update [fa7b0e4...f630744](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r529805363



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/SegmentAssignmentUtils.java
##########
@@ -87,6 +87,15 @@ private SegmentAssignmentUtils() {
     return instances;
   }
 
+  static List<String> getInstancesForDimTable(InstancePartitions instancePartitions) {

Review comment:
       We don't need instance partitions for dimTable. It should be assigned to all the instances under the given tag




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] siddharthteotia commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-737659210


   I would like to review this PR. Please give me a day or two. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-732476021


   > Can you please add a link to your design doc for join? Thanks
   Added in the description


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r544714804



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/TableDataManagerType.java
##########
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.manager;
+
+public enum TableDataManagerType {

Review comment:
       @dharakk  why was this suggestion not implemented?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r539726234



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;

Review comment:
       Same as above




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r533171720



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineSegmentAssignment.java
##########
@@ -133,6 +135,10 @@ private void checkReplication(InstancePartitions instancePartitions) {
   private List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
       InstancePartitions instancePartitions) {
     int numReplicaGroups = instancePartitions.getNumReplicaGroups();
+    if (_isDimTable) {

Review comment:
       Thanks for the suggestion, this makes the code much cleaner and also allows to add an implementation of `rebalanceTable` method.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r529810957



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/TableDataManagerType.java
##########
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.manager;
+
+public enum TableDataManagerType {

Review comment:
       If possible, let's not introduce this new type. One option is to add the `DIMENSION` as a `TableType`, but I'm not sure if we are going to support real-time dimTable in the future (we can add another `REALTIME_DIMENSION` maybe)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r537795504



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/config/TableConfigUtils.java
##########
@@ -58,6 +58,7 @@ public static TableConfig fromZNRecord(ZNRecord znRecord)
     String tableName = znRecord.getId();
 
     String tableType = simpleFields.get(TableConfig.TABLE_TYPE_KEY);
+    String isDimTable = simpleFields.get(TableConfig.IS_DIM_TABLE_KEY);

Review comment:
       Store as boolean for clarity?

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,94 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private boolean _isDimTable;

Review comment:
       No need to keep this member variable

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/config/TableDataManagerConfig.java
##########
@@ -35,6 +35,7 @@
   private static final String TABLE_DATA_MANAGER_DATA_DIRECTORY = "directory";
   private static final String TABLE_DATA_MANAGER_CONSUMER_DIRECTORY = "consumerDirectory";
   private static final String TABLE_DATA_MANAGER_NAME = "name";
+  private static final String TABLE_IS_DIMENSION = "isDimTable";

Review comment:
       We should not have this config on instance level. It does not make sense for a server to only serve dimension table

##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/utils/builder/TableConfigBuilder.java
##########
@@ -55,6 +55,7 @@
 
   private final TableType _tableType;
   private String _tableName;
+  private String _isDimTable;

Review comment:
       Store boolean

##########
File path: pinot-tools/src/main/java/org/apache/pinot/tools/JoinQuickStart.java
##########
@@ -0,0 +1,127 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.tools;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import org.apache.commons.io.FileUtils;
+import org.apache.pinot.spi.plugin.PluginManager;
+import org.apache.pinot.tools.admin.command.QuickstartRunner;
+
+import java.io.File;
+import java.net.URL;
+
+import static org.apache.pinot.tools.Quickstart.prettyPrintResponse;
+import static org.apache.pinot.tools.Quickstart.printStatus;
+
+
+public class JoinQuickStart {

Review comment:
       Don't add this quick start yet. We can add it along with the lookup feature support

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,94 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private boolean _isDimTable;
+  private TenantConfig _tenantConfig;
+
+  @Override
+  public void init(HelixManager helixManager, TableConfig tableConfig) {
+    _helixManager = helixManager;
+    _offlineTableName = tableConfig.getTableName();
+    _tenantConfig = tableConfig.getTenantConfig();
+    _isDimTable = Boolean.parseBoolean(tableConfig.getIsDimTable());
+    Preconditions.checkState(_isDimTable, "Not a dimension table: %s" + _offlineTableName);
+  }
+
+  @Override
+  public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    int numInstances = instances.size();
+    Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);
+
+    return instances;
+  }
+
+  @Override
+  public Map<String, Map<String, String>> rebalanceTable(Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap, @Nullable List<Tier> sortedTiers,
+      @Nullable Map<String, InstancePartitions> tierInstancePartitionsMap, Configuration config) {
+    Map<String, Map<String, String>> newAssignment = new TreeMap<>();
+    for (String segment : currentAssignment.keySet()) {

Review comment:
       No need to assign per segment. Fetch the instances with the tag once, and construct the new assignment

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/SegmentAssignmentFactory.java
##########
@@ -32,7 +32,9 @@ private SegmentAssignmentFactory() {
 
   public static SegmentAssignment getSegmentAssignment(HelixManager helixManager, TableConfig tableConfig) {
     SegmentAssignment segmentAssignment;
-    if (tableConfig.getTableType() == TableType.OFFLINE) {
+    if (tableConfig.getTableType() == TableType.OFFLINE && Boolean.parseBoolean(tableConfig.getIsDimTable())) {
+      segmentAssignment = new OfflineDimTableSegmentAssignment();
+    } else if (tableConfig.getTableType() == TableType.OFFLINE) {

Review comment:
       (nit)
   ```suggestion
       if (tableConfig.getTableType() == TableType.OFFLINE) {
         return tableConfig.isDimTable() ? new OfflineDimTableSegmentAssignment() : new OfflineSegmentAssignment();
       }
   ```

##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfig.java
##########
@@ -62,6 +63,9 @@
   @JsonPropertyDescription(value = "The type of the table (OFFLINE|REALTIME) (mandatory)")
   private final TableType _tableType;
 
+  @JsonPropertyDescription("Indicates whether the table is a dimension table or not")
+  private final String _isDimTable;

Review comment:
       Store boolean instead of String

##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfig.java
##########
@@ -142,6 +148,11 @@ public TableType getTableType() {
     return _tableType;
   }
 
+  @JsonProperty(IS_DIM_TABLE_KEY)
+  public String getIsDimTable() {

Review comment:
       ```suggestion
     public boolean isDimTable() {
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] yupeng9 commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
yupeng9 commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r529762948



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineSegmentAssignment.java
##########
@@ -133,6 +135,10 @@ private void checkReplication(InstancePartitions instancePartitions) {
   private List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
       InstancePartitions instancePartitions) {
     int numReplicaGroups = instancePartitions.getNumReplicaGroups();
+    if (_isDimTable) {
+

Review comment:
       empty line

##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfig.java
##########
@@ -62,6 +63,9 @@
   @JsonPropertyDescription(value = "The type of the table (OFFLINE|REALTIME) (mandatory)")
   private final TableType _tableType;
 
+  @JsonPropertyDescription("Indicates weather the table is a dimension table or not")

Review comment:
       whether

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/TableDataManagerType.java
##########
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.manager;
+
+public enum TableDataManagerType {
+    OFFLINE, REALTIME, DIMENSION

Review comment:
       add some javadoc explaining that `DIMENSION` is a special type of OFFLINE

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/SegmentAssignmentUtils.java
##########
@@ -87,6 +87,15 @@ private SegmentAssignmentUtils() {
     return instances;
   }
 
+  static List<String> getInstancesForDimTable(InstancePartitions instancePartitions) {
+    Preconditions
+            .checkState(instancePartitions.getNumReplicaGroups() == 1 && instancePartitions.getNumPartitions() == 1,
+                    "Instance partitions: %s should contain 1 replica and 1 partition for a dim table",

Review comment:
       why only 1 replica? I thought it will be on every host, so it shall be as many as the hosts?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/config/TableDataManagerConfig.java
##########
@@ -79,7 +81,10 @@ public static TableDataManagerConfig getDefaultHelixTableDataManagerConfig(
   public void overrideConfigs(@Nonnull TableConfig tableConfig) {
     // Override table level configs
 
-    // Currently we do not override any table level configs into TableDataManagerConfig
+    if (tableConfig.getIsDimTable() != null && tableConfig.getIsDimTable().equals("true")) {

Review comment:
       equalsIgnoreCase

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/offline/TableDataManagerProvider.java
##########
@@ -51,13 +52,17 @@ public static TableDataManager getTableDataManager(@Nonnull TableDataManagerConf
       @Nonnull String instanceId, @Nonnull ZkHelixPropertyStore<ZNRecord> propertyStore,
       @Nonnull ServerMetrics serverMetrics, @Nonnull HelixManager helixManager) {
     TableDataManager tableDataManager;
-    switch (TableType.valueOf(tableDataManagerConfig.getTableDataManagerType())) {
+    switch (TableDataManagerType.valueOf(tableDataManagerConfig.getTableDataManagerType())) {
       case OFFLINE:
         tableDataManager = new OfflineTableDataManager();
         break;
       case REALTIME:
         tableDataManager = new RealtimeTableDataManager(_segmentBuildSemaphore);
         break;
+      case DIMENSION:
+        // TODO: Create a DimTableDataManager here when available

Review comment:
       do we need `DimTableDataManager`? or enhance `OfflineTableDataManager `

##########
File path: pinot-tools/src/main/resources/examples/batch/dimBaseballTeams/rawdata/dimBaseballTeams_data.csv
##########
@@ -0,0 +1,52 @@
+teamID,teamName
+ANA,Anaheim Angels
+ARI,Arizona Diamondbacks
+ATL,Atlanta Braves
+BAL,Baltimore Orioles (original- 1901–1902 current- since 1954)
+BOS,Boston Red Caps/Beaneaters (from 1876–1900) or Boston Red Sox (since 1953)
+BOA,Boston Americans (1901–1907)
+BOB,Boston Beaneaters (1901–1906) or Boston Braves/Bees (1912–1952)
+BOD,Boston Doves (1907–1910)
+BOR,Boston Red Sox (1908–1952)
+BOU,Boston Rustlers (1911)
+BKN,Brooklyn Dodgers/Robins/Superbas/Bridegrooms/Grooms/Grays/Atlantics
+CAL,California Angels
+CHC,Chicago Cubs (since 1903)
+CHO,Chicago Orphans (1901–1902)
+CHI,Chicago Orphans/Colts/White Stockings (1876–1900)
+CWS,Chicago White Sox
+CIN,Cincinnati Reds/Red Stockings
+CLE,Cleveland Indians/Naps/Broncos/Bluebirds/Lake Shores
+COL,Colorado Rockies
+DET,Detroit Tigers
+FLA,Florida Marlins
+HOU,Houston Astros/Colt .45s
+KC,Kansas City Athletics (1955–1967) or Kansas City Royals (since 1969)
+LAA,Los Angeles Angels (of Anaheim)
+LAD,Los Angeles Dodgers
+LA,Los Angeles Dodgers (1958–1961 1965–2004)
+MIA,Miami Marlins
+MIL,Milwaukee Brewers (original 1901) or Milwaukee Braves or Milwaukee Brewers (current since 1970)
+MIN,Minnesota Twins
+MTL,Montreal Expos
+NY,New York Gothams/Giants (1883–1902) or New York Yankees (1958–1961)
+NYG,New York Giants/Gothams
+NYM,New York Mets
+NYY,New York Yankees
+NYH,New York Highlanders
+OAK,Oakland Athletics
+PHA,Philadelphia Athletics
+PHI,Philadelphia Phillies/Quakers (1883–1900 1955–present)
+PHP,Philadelphia Phillies (1901–1942 1945–1954)
+PHB,Philadelphia Blue Jays (1943–1944)
+PIT,Pittsburgh Pirates/Alleghenys
+SD,San Diego Padres
+SEA,Seattle Mariners (since 1977) or Seattle Pilots (1969)
+SF,San Francisco Giants
+SLB,St. Louis Browns (AL)
+SLC,St. Louis Cardinals (1902–1953)
+STL,St. Louis Cardinals/Perfectos/Browns/Brown Stockings
+TB,Tampa Bay (Devil) Rays
+TEX,Texas Rangers
+TOR,Toronto Blue Jays
+WSH,Washington Senators (original- 1901–1960) expansion- 1961–1971) or Washington Nationals (since 2005)

Review comment:
       newline

##########
File path: pinot-tools/src/main/resources/examples/batch/dimBaseballTeams/dimBaseballTeams_offline_table_config.json
##########
@@ -0,0 +1,23 @@
+{
+  "tableName": "dimBaseballTeams",
+  "tableType": "OFFLINE",
+  "isDimTable": "true",

Review comment:
       I think we either have `tableType`:`DIMENSION` or `"isDimTable": "true"`, to be consistent with your `TableDataManagerType`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r534650637



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,80 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+    private HelixManager _helixManager;
+    private String _offlineTableName;
+    private boolean _isDimTable;
+    private TenantConfig _tenantConfig;
+
+    @Override
+    public void init(HelixManager helixManager, TableConfig tableConfig) {
+        _helixManager = helixManager;
+        _offlineTableName = tableConfig.getTableName();
+        _tenantConfig = tableConfig.getTenantConfig();
+        _isDimTable = Boolean.parseBoolean(tableConfig.getIsDimTable());
+        Preconditions.checkState(_isDimTable, "Not a dimension table: %s" + _offlineTableName);
+    }
+
+    @Override
+    public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment, Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+        String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+        List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+        int numInstances = instances.size();
+        Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);
+
+        // Sort the instances and rotate the list based on the table name

Review comment:
       I was taking my cue from other places where we were getting all the instances under a tag, but you are right there is no need here, removed it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] yupeng9 commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
yupeng9 commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r539685992



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment

Review comment:
       Thus -> . Thus

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;

Review comment:
       final

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;

Review comment:
       final

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;
+
+  @Override
+  public void init(HelixManager helixManager, TableConfig tableConfig) {
+    _helixManager = helixManager;
+    _offlineTableName = tableConfig.getTableName();
+    _tenantConfig = tableConfig.getTenantConfig();
+    Preconditions.checkState(tableConfig.isDimTable(), "Not a dimension table: %s" + _offlineTableName);
+  }
+
+  @Override
+  public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    int numInstances = instances.size();
+    Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);
+
+    return instances;
+  }
+
+  @Override
+  public Map<String, Map<String, String>> rebalanceTable(Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap, @Nullable List<Tier> sortedTiers,
+      @Nullable Map<String, InstancePartitions> tierInstancePartitionsMap, Configuration config) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    Map<String, Map<String, String>> newAssignment = new TreeMap<>();
+    for (String segment : currentAssignment.keySet()) {
+      newAssignment.put(segment, SegmentAssignmentUtils
+          .getInstanceStateMap(instances, CommonConstants.Helix.StateModel.SegmentStateModel.ONLINE));

Review comment:
       What do we do about the offline ones? Will we assign segments to them once they are back online?

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;
+
+  @Override
+  public void init(HelixManager helixManager, TableConfig tableConfig) {
+    _helixManager = helixManager;
+    _offlineTableName = tableConfig.getTableName();
+    _tenantConfig = tableConfig.getTenantConfig();
+    Preconditions.checkState(tableConfig.isDimTable(), "Not a dimension table: %s" + _offlineTableName);

Review comment:
       nit: precondition goes to the top

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;
+
+  @Override
+  public void init(HelixManager helixManager, TableConfig tableConfig) {
+    _helixManager = helixManager;
+    _offlineTableName = tableConfig.getTableName();
+    _tenantConfig = tableConfig.getTenantConfig();
+    Preconditions.checkState(tableConfig.isDimTable(), "Not a dimension table: %s" + _offlineTableName);
+  }
+
+  @Override
+  public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    int numInstances = instances.size();
+    Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);

Review comment:
       I think empty instance is valid? Shall this be a warning?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r533165128



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/offline/TableDataManagerProvider.java
##########
@@ -51,13 +52,17 @@ public static TableDataManager getTableDataManager(@Nonnull TableDataManagerConf
       @Nonnull String instanceId, @Nonnull ZkHelixPropertyStore<ZNRecord> propertyStore,
       @Nonnull ServerMetrics serverMetrics, @Nonnull HelixManager helixManager) {
     TableDataManager tableDataManager;
-    switch (TableType.valueOf(tableDataManagerConfig.getTableDataManagerType())) {
+    switch (TableDataManagerType.valueOf(tableDataManagerConfig.getTableDataManagerType())) {
       case OFFLINE:
         tableDataManager = new OfflineTableDataManager();
         break;
       case REALTIME:
         tableDataManager = new RealtimeTableDataManager(_segmentBuildSemaphore);
         break;
+      case DIMENSION:
+        // TODO: Create a DimTableDataManager here when available

Review comment:
       `DimTableManager` will have a lot of specific themes related to preloading the table and also storing the dim table static map, thus it will muddy the `OfflineTableDataManager` unnecessarily. Thus, we are thinking of extending the `OfflineTableDataManager` in the new manager class to implement the extra functionality. We can discuss this in following PRs as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-io commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-732496192


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=h1) Report
   > Merging [#6286](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=desc) (a13ebf2) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1beaab59b73f26c4e35f3b9bc856b03806cddf5a?el=desc) (1beaab5) will **decrease** coverage by `2.31%`.
   > The diff coverage is `45.12%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6286/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6286      +/-   ##
   ==========================================
   - Coverage   66.44%   64.13%   -2.32%     
   ==========================================
     Files        1075     1251     +176     
     Lines       54773    59189    +4416     
     Branches     8168     8781     +613     
   ==========================================
   + Hits        36396    37961    +1565     
   - Misses      15700    18454    +2754     
   - Partials     2677     2774      +97     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `64.13% <45.12%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...e/pinot/broker/api/resources/PinotBrokerDebug.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYXBpL3Jlc291cmNlcy9QaW5vdEJyb2tlckRlYnVnLmphdmE=) | `0.00% <0.00%> (-79.32%)` | :arrow_down: |
   | [...ot/broker/broker/AllowAllAccessControlFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL0FsbG93QWxsQWNjZXNzQ29udHJvbEZhY3RvcnkuamF2YQ==) | `71.42% <ø> (-28.58%)` | :arrow_down: |
   | [.../helix/BrokerUserDefinedMessageHandlerFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL2hlbGl4L0Jyb2tlclVzZXJEZWZpbmVkTWVzc2FnZUhhbmRsZXJGYWN0b3J5LmphdmE=) | `33.96% <0.00%> (-32.71%)` | :arrow_down: |
   | [...ker/routing/instanceselector/InstanceSelector.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9pbnN0YW5jZXNlbGVjdG9yL0luc3RhbmNlU2VsZWN0b3IuamF2YQ==) | `100.00% <ø> (ø)` | |
   | [...ava/org/apache/pinot/client/AbstractResultSet.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Fic3RyYWN0UmVzdWx0U2V0LmphdmE=) | `66.66% <0.00%> (+9.52%)` | :arrow_up: |
   | [.../main/java/org/apache/pinot/client/Connection.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Nvbm5lY3Rpb24uamF2YQ==) | `35.55% <0.00%> (-13.29%)` | :arrow_down: |
   | [...inot/client/JsonAsyncHttpPinotClientTransport.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0pzb25Bc3luY0h0dHBQaW5vdENsaWVudFRyYW5zcG9ydC5qYXZh) | `10.90% <0.00%> (-51.10%)` | :arrow_down: |
   | [...not/common/assignment/InstancePartitionsUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vYXNzaWdubWVudC9JbnN0YW5jZVBhcnRpdGlvbnNVdGlscy5qYXZh) | `73.80% <ø> (+0.63%)` | :arrow_up: |
   | [.../apache/pinot/common/exception/QueryException.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZXhjZXB0aW9uL1F1ZXJ5RXhjZXB0aW9uLmphdmE=) | `90.27% <ø> (+5.55%)` | :arrow_up: |
   | [...pinot/common/function/AggregationFunctionType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vQWdncmVnYXRpb25GdW5jdGlvblR5cGUuamF2YQ==) | `98.27% <ø> (-1.73%)` | :arrow_down: |
   | ... and [1096 more](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=footer). Last update [fa7b0e4...a13ebf2](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r539720325



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;
+
+  @Override
+  public void init(HelixManager helixManager, TableConfig tableConfig) {
+    _helixManager = helixManager;
+    _offlineTableName = tableConfig.getTableName();
+    _tenantConfig = tableConfig.getTenantConfig();
+    Preconditions.checkState(tableConfig.isDimTable(), "Not a dimension table: %s" + _offlineTableName);
+  }
+
+  @Override
+  public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    int numInstances = instances.size();
+    Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);

Review comment:
       We don't allow empty instances. When creating the table, we check if there are instances for this table




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r533579069



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,80 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+    private HelixManager _helixManager;

Review comment:
       Your indentation is off. Can you please import pinot coding conventions?

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,80 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+    private HelixManager _helixManager;
+    private String _offlineTableName;
+    private boolean _isDimTable;
+    private TenantConfig _tenantConfig;
+
+    @Override
+    public void init(HelixManager helixManager, TableConfig tableConfig) {
+        _helixManager = helixManager;
+        _offlineTableName = tableConfig.getTableName();
+        _tenantConfig = tableConfig.getTenantConfig();
+        _isDimTable = Boolean.parseBoolean(tableConfig.getIsDimTable());
+        Preconditions.checkState(_isDimTable, "Not a dimension table: %s" + _offlineTableName);
+    }
+
+    @Override
+    public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment, Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+        String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+        List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+        int numInstances = instances.size();
+        Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);
+
+        // Sort the instances and rotate the list based on the table name

Review comment:
       I don't understand why we sort the instances and rotate if we are returning all instances for assignment. Can you add some comments as to why this is being done?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] yupeng9 commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
yupeng9 commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-742083477


   @dharakk FYI It will be nice to keep the commit history so we can review the incremental changes


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r539558913



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/utils/config/TableConfigUtils.java
##########
@@ -21,6 +21,7 @@
 import com.fasterxml.jackson.core.JsonProcessingException;
 import com.fasterxml.jackson.core.type.TypeReference;
 import com.google.common.base.Preconditions;
+import com.sun.org.apache.xpath.internal.operations.Bool;

Review comment:
       Wrong import

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/config/TableDataManagerConfig.java
##########
@@ -35,6 +35,7 @@
   private static final String TABLE_DATA_MANAGER_DATA_DIRECTORY = "directory";
   private static final String TABLE_DATA_MANAGER_CONSUMER_DIRECTORY = "consumerDirectory";
   private static final String TABLE_DATA_MANAGER_NAME = "name";
+  private static final String TABLE_IS_DIMENSION = "isDimTable";

Review comment:
       The config in this class is for server specific config (the location of the data directory etc.). We should not keep the `isDimTable()` info in this class, it should be in table config only. Also, I don't think `isDimTable()` is used




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-io edited a comment on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-732496192


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=h1) Report
   > Merging [#6286](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=desc) (f0071f1) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1beaab59b73f26c4e35f3b9bc856b03806cddf5a?el=desc) (1beaab5) will **decrease** coverage by `1.08%`.
   > The diff coverage is `45.12%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6286/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6286      +/-   ##
   ==========================================
   - Coverage   66.44%   65.36%   -1.09%     
   ==========================================
     Files        1075     1282     +207     
     Lines       54773    61818    +7045     
     Branches     8168     8982     +814     
   ==========================================
   + Hits        36396    40408    +4012     
   - Misses      15700    18522    +2822     
   - Partials     2677     2888     +211     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `65.36% <45.12%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...e/pinot/broker/api/resources/PinotBrokerDebug.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYXBpL3Jlc291cmNlcy9QaW5vdEJyb2tlckRlYnVnLmphdmE=) | `0.00% <0.00%> (-79.32%)` | :arrow_down: |
   | [...ot/broker/broker/AllowAllAccessControlFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL0FsbG93QWxsQWNjZXNzQ29udHJvbEZhY3RvcnkuamF2YQ==) | `71.42% <ø> (-28.58%)` | :arrow_down: |
   | [.../helix/BrokerUserDefinedMessageHandlerFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL2hlbGl4L0Jyb2tlclVzZXJEZWZpbmVkTWVzc2FnZUhhbmRsZXJGYWN0b3J5LmphdmE=) | `33.96% <0.00%> (-32.71%)` | :arrow_down: |
   | [...ker/routing/instanceselector/InstanceSelector.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9pbnN0YW5jZXNlbGVjdG9yL0luc3RhbmNlU2VsZWN0b3IuamF2YQ==) | `100.00% <ø> (ø)` | |
   | [...ava/org/apache/pinot/client/AbstractResultSet.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Fic3RyYWN0UmVzdWx0U2V0LmphdmE=) | `66.66% <0.00%> (+9.52%)` | :arrow_up: |
   | [.../main/java/org/apache/pinot/client/Connection.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Nvbm5lY3Rpb24uamF2YQ==) | `35.55% <0.00%> (-13.29%)` | :arrow_down: |
   | [...inot/client/JsonAsyncHttpPinotClientTransport.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0pzb25Bc3luY0h0dHBQaW5vdENsaWVudFRyYW5zcG9ydC5qYXZh) | `10.90% <0.00%> (-51.10%)` | :arrow_down: |
   | [...not/common/assignment/InstancePartitionsUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vYXNzaWdubWVudC9JbnN0YW5jZVBhcnRpdGlvbnNVdGlscy5qYXZh) | `73.80% <ø> (+0.63%)` | :arrow_up: |
   | [.../apache/pinot/common/exception/QueryException.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZXhjZXB0aW9uL1F1ZXJ5RXhjZXB0aW9uLmphdmE=) | `90.27% <ø> (+5.55%)` | :arrow_up: |
   | [...pinot/common/function/AggregationFunctionType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vQWdncmVnYXRpb25GdW5jdGlvblR5cGUuamF2YQ==) | `98.27% <ø> (-1.73%)` | :arrow_down: |
   | ... and [1131 more](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=footer). Last update [fa7b0e4...f0071f1](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r533162386



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/SegmentAssignmentUtils.java
##########
@@ -87,6 +87,15 @@ private SegmentAssignmentUtils() {
     return instances;
   }
 
+  static List<String> getInstancesForDimTable(InstancePartitions instancePartitions) {
+    Preconditions
+            .checkState(instancePartitions.getNumReplicaGroups() == 1 && instancePartitions.getNumPartitions() == 1,
+                    "Instance partitions: %s should contain 1 replica and 1 partition for a dim table",

Review comment:
       Removed this way of gathering hosts and now getting hosts from a given tag directly. In this case the replication number as such will be disregarded and by default we will assign to all the segment to all the hosts




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] chenboat commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
chenboat commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r530122861



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/SegmentAssignmentUtils.java
##########
@@ -87,6 +87,15 @@ private SegmentAssignmentUtils() {
     return instances;
   }
 
+  static List<String> getInstancesForDimTable(InstancePartitions instancePartitions) {

Review comment:
       Add a test?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] chenboat commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
chenboat commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r530122778



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineSegmentAssignment.java
##########
@@ -133,6 +135,10 @@ private void checkReplication(InstancePartitions instancePartitions) {
   private List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
       InstancePartitions instancePartitions) {
     int numReplicaGroups = instancePartitions.getNumReplicaGroups();
+    if (_isDimTable) {

Review comment:
       can we add a new subclass of OfflineSegmentAssignment instead of using a boolean check?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r539100297



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/config/TableDataManagerConfig.java
##########
@@ -35,6 +35,7 @@
   private static final String TABLE_DATA_MANAGER_DATA_DIRECTORY = "directory";
   private static final String TABLE_DATA_MANAGER_CONSUMER_DIRECTORY = "consumerDirectory";
   private static final String TABLE_DATA_MANAGER_NAME = "name";
+  private static final String TABLE_IS_DIMENSION = "isDimTable";

Review comment:
       Sorry, I didn't quite get this, can you elaborate?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] chenboat commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
chenboat commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r534424189



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,80 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {

Review comment:
       Can you add javadoc to this new class? It should explain what is the property of the new DimTableSegmentAssignment? Also its differences with other SegmentAssignment strategies.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r535414983



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/TableDataManagerType.java
##########
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.manager;
+
+public enum TableDataManagerType {

Review comment:
       Some suggestions: We could use `TableSubtype` maybe. And make it an enum so that we have some room in the future. I am not sure if, for example, a different layout of a larger table can also support joins.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] yupeng9 commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
yupeng9 commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r539748989



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;
+
+  @Override
+  public void init(HelixManager helixManager, TableConfig tableConfig) {
+    _helixManager = helixManager;
+    _offlineTableName = tableConfig.getTableName();
+    _tenantConfig = tableConfig.getTenantConfig();
+    Preconditions.checkState(tableConfig.isDimTable(), "Not a dimension table: %s" + _offlineTableName);
+  }
+
+  @Override
+  public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    int numInstances = instances.size();
+    Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);

Review comment:
       I see. But can the assignment happen post-creation? For example, what do you do after adding a new host.

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;
+
+  @Override
+  public void init(HelixManager helixManager, TableConfig tableConfig) {
+    _helixManager = helixManager;
+    _offlineTableName = tableConfig.getTableName();
+    _tenantConfig = tableConfig.getTenantConfig();
+    Preconditions.checkState(tableConfig.isDimTable(), "Not a dimension table: %s" + _offlineTableName);
+  }
+
+  @Override
+  public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    int numInstances = instances.size();
+    Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);
+
+    return instances;
+  }
+
+  @Override
+  public Map<String, Map<String, String>> rebalanceTable(Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap, @Nullable List<Tier> sortedTiers,
+      @Nullable Map<String, InstancePartitions> tierInstancePartitionsMap, Configuration config) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    Map<String, Map<String, String>> newAssignment = new TreeMap<>();
+    for (String segment : currentAssignment.keySet()) {
+      newAssignment.put(segment, SegmentAssignmentUtils
+          .getInstanceStateMap(instances, CommonConstants.Helix.StateModel.SegmentStateModel.ONLINE));

Review comment:
       Makes sense




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-736947530


   > > > I remember adding support in Helix where you simply mention the replication factor as N and Helix will assign a segment to all nodes in the cluster that match the tag. Can we use that?
   > > 
   > > 
   > > I suggest not to use the helix mechanism (even if it is available) -- for operational ease. "Pinot code controls segment assignment" -- is much easier to reason about, than looking up whether isDimTable is set to know what is going on.
   > 
   > @mcvsubbu I was curious about this suggestion. I think the purpose is to have a declarative way of segment assignment for the DimTable, but leave the actual implementation to the corresponding module, be it Helix or Pinot. Is there a particular reason that we shall not consider Helix? Is it due to reliability or some other considerations?
   
   The idea is two-fold:
   1. It is better we have the same pattern of assigning segments --  that of using a specific segment assignment strategy. Just add another name, and write the code for it, it should be fairly small, as has been demonstrated in this PR.
   2. If Helix (for whatever reason) changes behavior, Pinot will not suffer. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] yupeng9 commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
yupeng9 commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-736765701


   > > I remember adding support in Helix where you simply mention the replication factor as N and Helix will assign a segment to all nodes in the cluster that match the tag. Can we use that?
   > 
   > I suggest not to use the helix mechanism (even if it is available) -- for operational ease. "Pinot code controls segment assignment" -- is much easier to reason about, than looking up whether isDimTable is set to know what is going on.
   
   @mcvsubbu I was curious about this suggestion. I think the purpose is to have a declarative way of segment assignment for the DimTable, but leave the actual implementation to the corresponding module, be it Helix or Pinot. Is there a particular reason that we shall not consider Helix? Is it due to reliability or some other considerations?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-io edited a comment on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-732496192


   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=h1) Report
   > Merging [#6286](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=desc) (31e3522) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/1beaab59b73f26c4e35f3b9bc856b03806cddf5a?el=desc) (1beaab5) will **decrease** coverage by `1.07%`.
   > The diff coverage is `45.12%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/6286/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6286      +/-   ##
   ==========================================
   - Coverage   66.44%   65.37%   -1.08%     
   ==========================================
     Files        1075     1286     +211     
     Lines       54773    62061    +7288     
     Branches     8168     9004     +836     
   ==========================================
   + Hits        36396    40571    +4175     
   - Misses      15700    18598    +2898     
   - Partials     2677     2892     +215     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `65.37% <45.12%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...e/pinot/broker/api/resources/PinotBrokerDebug.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYXBpL3Jlc291cmNlcy9QaW5vdEJyb2tlckRlYnVnLmphdmE=) | `0.00% <0.00%> (-79.32%)` | :arrow_down: |
   | [...ot/broker/broker/AllowAllAccessControlFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL0FsbG93QWxsQWNjZXNzQ29udHJvbEZhY3RvcnkuamF2YQ==) | `71.42% <ø> (-28.58%)` | :arrow_down: |
   | [.../helix/BrokerUserDefinedMessageHandlerFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL2hlbGl4L0Jyb2tlclVzZXJEZWZpbmVkTWVzc2FnZUhhbmRsZXJGYWN0b3J5LmphdmE=) | `33.96% <0.00%> (-32.71%)` | :arrow_down: |
   | [...ker/routing/instanceselector/InstanceSelector.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9pbnN0YW5jZXNlbGVjdG9yL0luc3RhbmNlU2VsZWN0b3IuamF2YQ==) | `100.00% <ø> (ø)` | |
   | [...ava/org/apache/pinot/client/AbstractResultSet.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Fic3RyYWN0UmVzdWx0U2V0LmphdmE=) | `66.66% <0.00%> (+9.52%)` | :arrow_up: |
   | [.../main/java/org/apache/pinot/client/Connection.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Nvbm5lY3Rpb24uamF2YQ==) | `35.55% <0.00%> (-13.29%)` | :arrow_down: |
   | [...inot/client/JsonAsyncHttpPinotClientTransport.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0pzb25Bc3luY0h0dHBQaW5vdENsaWVudFRyYW5zcG9ydC5qYXZh) | `10.90% <0.00%> (-51.10%)` | :arrow_down: |
   | [...not/common/assignment/InstancePartitionsUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vYXNzaWdubWVudC9JbnN0YW5jZVBhcnRpdGlvbnNVdGlscy5qYXZh) | `73.80% <ø> (+0.63%)` | :arrow_up: |
   | [.../apache/pinot/common/exception/QueryException.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZXhjZXB0aW9uL1F1ZXJ5RXhjZXB0aW9uLmphdmE=) | `90.27% <ø> (+5.55%)` | :arrow_up: |
   | [...pinot/common/function/AggregationFunctionType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vQWdncmVnYXRpb25GdW5jdGlvblR5cGUuamF2YQ==) | `100.00% <ø> (ø)` | |
   | ... and [1134 more](https://codecov.io/gh/apache/incubator-pinot/pull/6286/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=footer). Last update [fa7b0e4...31e3522](https://codecov.io/gh/apache/incubator-pinot/pull/6286?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r533162641



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/TableDataManagerType.java
##########
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.manager;
+
+public enum TableDataManagerType {
+    OFFLINE, REALTIME, DIMENSION

Review comment:
       Removed the newly added type, keeping on the boolean `isDimTable` check




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-732473490


   Can you please add a link to your design doc for join? Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] yupeng9 commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
yupeng9 commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r539749196



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;

Review comment:
       ah, right. I missed that




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r533165996



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/SegmentAssignmentUtils.java
##########
@@ -87,6 +87,15 @@ private SegmentAssignmentUtils() {
     return instances;
   }
 
+  static List<String> getInstancesForDimTable(InstancePartitions instancePartitions) {

Review comment:
       Removed this way of gathering hosts, now explicitly returning all the hosts under a given tag as you suggested




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] dharakk commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
dharakk commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r533170400



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/TableDataManagerType.java
##########
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.manager;
+
+public enum TableDataManagerType {

Review comment:
       I refrained from adding a new table type because of the following two reasons:
   
   1. Since ingestion mode is still offline, it seems appropriate that table type still remain OFFLINE
   2. Adding a new type would be an invasive change and I wasn't sure of the blast radius.
   
   Instead of adding a new `TableDataManagerType` I have now added a boolean similar to how we are identifying the dim table in table config. Even for a future realtime dim table, this approach should work. Does this approach look good?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r539720614



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineDimTableSegmentAssignment.java
##########
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.controller.helix.core.assignment.segment;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.configuration.Configuration;
+import org.apache.helix.HelixManager;
+import org.apache.pinot.common.assignment.InstancePartitions;
+import org.apache.pinot.common.tier.Tier;
+import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.config.TagNameUtils;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.TenantConfig;
+import org.apache.pinot.spi.config.table.assignment.InstancePartitionsType;
+
+import javax.annotation.Nullable;
+
+import java.util.List;
+import java.util.Map;
+import java.util.TreeMap;
+
+
+/**
+ * Segment assignment for an offline dimension table.
+ * <ul>
+ *   <li>
+ *     <p>This segment assignment strategy is used when {@link TableConfig#IS_DIM_TABLE_KEY}is
+ *     set to "true".</p>
+ *   </li>
+ *   <li>
+ *     <p>For a dimension table we assign the segment to all the hosts Thus for this assignment
+ *     strategy we simply return all the hosts under a given tag as the assigned hosts for
+ *     a given segment.</p>
+ *   </li>
+ * </ul>
+ */
+public class OfflineDimTableSegmentAssignment implements SegmentAssignment {
+
+  private HelixManager _helixManager;
+  private String _offlineTableName;
+  private TenantConfig _tenantConfig;
+
+  @Override
+  public void init(HelixManager helixManager, TableConfig tableConfig) {
+    _helixManager = helixManager;
+    _offlineTableName = tableConfig.getTableName();
+    _tenantConfig = tableConfig.getTenantConfig();
+    Preconditions.checkState(tableConfig.isDimTable(), "Not a dimension table: %s" + _offlineTableName);
+  }
+
+  @Override
+  public List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    int numInstances = instances.size();
+    Preconditions.checkState(numInstances > 0, "No instance found with tag: %s", serverTag);
+
+    return instances;
+  }
+
+  @Override
+  public Map<String, Map<String, String>> rebalanceTable(Map<String, Map<String, String>> currentAssignment,
+      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap, @Nullable List<Tier> sortedTiers,
+      @Nullable Map<String, InstancePartitions> tierInstancePartitionsMap, Configuration config) {
+    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
+    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);
+    Map<String, Map<String, String>> newAssignment = new TreeMap<>();
+    for (String segment : currentAssignment.keySet()) {
+      newAssignment.put(segment, SegmentAssignmentUtils
+          .getInstanceStateMap(instances, CommonConstants.Helix.StateModel.SegmentStateModel.ONLINE));

Review comment:
       This is ideal state, where we don't have OFFLINE state for the instance




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] yupeng9 commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
yupeng9 commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-736986030


   > > > > I remember adding support in Helix where you simply mention the replication factor as N and Helix will assign a segment to all nodes in the cluster that match the tag. Can we use that?
   > > > 
   > > > 
   > > > I suggest not to use the helix mechanism (even if it is available) -- for operational ease. "Pinot code controls segment assignment" -- is much easier to reason about, than looking up whether isDimTable is set to know what is going on.
   > > 
   > > 
   > > @mcvsubbu I was curious about this suggestion. I think the purpose is to have a declarative way of segment assignment for the DimTable, but leave the actual implementation to the corresponding module, be it Helix or Pinot. Is there a particular reason that we shall not consider Helix? Is it due to reliability or some other considerations?
   > 
   > The idea is two-fold:
   > 
   > 1. It is better we have the same pattern of assigning segments --  that of using a specific segment assignment strategy. Just add another name, and write the code for it, it should be fairly small, as has been demonstrated in this PR.
   > 2. If Helix (for whatever reason) changes behavior, Pinot will not suffer.
   
   Makes sense. Thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-736695196


   > I remember adding support in Helix where you simply mention the replication factor as N and Helix will assign a segment to all nodes in the cluster that match the tag. Can we use that?
   
   I suggest not to use the helix mechanism (even if it is available) -- for operational ease.  "Pinot code controls segment assignment" --  is  much easier to reason about, than looking up whether isDimTable is set to know what is going on.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] kishoreg commented on pull request #6286: Adding offline dimension table creation and segment assignment

Posted by GitBox <gi...@apache.org>.
kishoreg commented on pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#issuecomment-736692535


   I remember adding support in Helix where you simply mention the replication factor as N and Helix will assign a segment to all nodes in the cluster that match the tag. Can we use that? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org