You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/11/24 17:53:28 UTC

[GitHub] [incubator-pinot] yupeng9 commented on a change in pull request #6286: Adding offline dimension table creation and segment assignment

yupeng9 commented on a change in pull request #6286:
URL: https://github.com/apache/incubator-pinot/pull/6286#discussion_r529762948



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/OfflineSegmentAssignment.java
##########
@@ -133,6 +135,10 @@ private void checkReplication(InstancePartitions instancePartitions) {
   private List<String> assignSegment(String segmentName, Map<String, Map<String, String>> currentAssignment,
       InstancePartitions instancePartitions) {
     int numReplicaGroups = instancePartitions.getNumReplicaGroups();
+    if (_isDimTable) {
+

Review comment:
       empty line

##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfig.java
##########
@@ -62,6 +63,9 @@
   @JsonPropertyDescription(value = "The type of the table (OFFLINE|REALTIME) (mandatory)")
   private final TableType _tableType;
 
+  @JsonPropertyDescription("Indicates weather the table is a dimension table or not")

Review comment:
       whether

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/TableDataManagerType.java
##########
@@ -0,0 +1,23 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.data.manager;
+
+public enum TableDataManagerType {
+    OFFLINE, REALTIME, DIMENSION

Review comment:
       add some javadoc explaining that `DIMENSION` is a special type of OFFLINE

##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/SegmentAssignmentUtils.java
##########
@@ -87,6 +87,15 @@ private SegmentAssignmentUtils() {
     return instances;
   }
 
+  static List<String> getInstancesForDimTable(InstancePartitions instancePartitions) {
+    Preconditions
+            .checkState(instancePartitions.getNumReplicaGroups() == 1 && instancePartitions.getNumPartitions() == 1,
+                    "Instance partitions: %s should contain 1 replica and 1 partition for a dim table",

Review comment:
       why only 1 replica? I thought it will be on every host, so it shall be as many as the hosts?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/config/TableDataManagerConfig.java
##########
@@ -79,7 +81,10 @@ public static TableDataManagerConfig getDefaultHelixTableDataManagerConfig(
   public void overrideConfigs(@Nonnull TableConfig tableConfig) {
     // Override table level configs
 
-    // Currently we do not override any table level configs into TableDataManagerConfig
+    if (tableConfig.getIsDimTable() != null && tableConfig.getIsDimTable().equals("true")) {

Review comment:
       equalsIgnoreCase

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/offline/TableDataManagerProvider.java
##########
@@ -51,13 +52,17 @@ public static TableDataManager getTableDataManager(@Nonnull TableDataManagerConf
       @Nonnull String instanceId, @Nonnull ZkHelixPropertyStore<ZNRecord> propertyStore,
       @Nonnull ServerMetrics serverMetrics, @Nonnull HelixManager helixManager) {
     TableDataManager tableDataManager;
-    switch (TableType.valueOf(tableDataManagerConfig.getTableDataManagerType())) {
+    switch (TableDataManagerType.valueOf(tableDataManagerConfig.getTableDataManagerType())) {
       case OFFLINE:
         tableDataManager = new OfflineTableDataManager();
         break;
       case REALTIME:
         tableDataManager = new RealtimeTableDataManager(_segmentBuildSemaphore);
         break;
+      case DIMENSION:
+        // TODO: Create a DimTableDataManager here when available

Review comment:
       do we need `DimTableDataManager`? or enhance `OfflineTableDataManager `

##########
File path: pinot-tools/src/main/resources/examples/batch/dimBaseballTeams/rawdata/dimBaseballTeams_data.csv
##########
@@ -0,0 +1,52 @@
+teamID,teamName
+ANA,Anaheim Angels
+ARI,Arizona Diamondbacks
+ATL,Atlanta Braves
+BAL,Baltimore Orioles (original- 1901–1902 current- since 1954)
+BOS,Boston Red Caps/Beaneaters (from 1876–1900) or Boston Red Sox (since 1953)
+BOA,Boston Americans (1901–1907)
+BOB,Boston Beaneaters (1901–1906) or Boston Braves/Bees (1912–1952)
+BOD,Boston Doves (1907–1910)
+BOR,Boston Red Sox (1908–1952)
+BOU,Boston Rustlers (1911)
+BKN,Brooklyn Dodgers/Robins/Superbas/Bridegrooms/Grooms/Grays/Atlantics
+CAL,California Angels
+CHC,Chicago Cubs (since 1903)
+CHO,Chicago Orphans (1901–1902)
+CHI,Chicago Orphans/Colts/White Stockings (1876–1900)
+CWS,Chicago White Sox
+CIN,Cincinnati Reds/Red Stockings
+CLE,Cleveland Indians/Naps/Broncos/Bluebirds/Lake Shores
+COL,Colorado Rockies
+DET,Detroit Tigers
+FLA,Florida Marlins
+HOU,Houston Astros/Colt .45s
+KC,Kansas City Athletics (1955–1967) or Kansas City Royals (since 1969)
+LAA,Los Angeles Angels (of Anaheim)
+LAD,Los Angeles Dodgers
+LA,Los Angeles Dodgers (1958–1961 1965–2004)
+MIA,Miami Marlins
+MIL,Milwaukee Brewers (original 1901) or Milwaukee Braves or Milwaukee Brewers (current since 1970)
+MIN,Minnesota Twins
+MTL,Montreal Expos
+NY,New York Gothams/Giants (1883–1902) or New York Yankees (1958–1961)
+NYG,New York Giants/Gothams
+NYM,New York Mets
+NYY,New York Yankees
+NYH,New York Highlanders
+OAK,Oakland Athletics
+PHA,Philadelphia Athletics
+PHI,Philadelphia Phillies/Quakers (1883–1900 1955–present)
+PHP,Philadelphia Phillies (1901–1942 1945–1954)
+PHB,Philadelphia Blue Jays (1943–1944)
+PIT,Pittsburgh Pirates/Alleghenys
+SD,San Diego Padres
+SEA,Seattle Mariners (since 1977) or Seattle Pilots (1969)
+SF,San Francisco Giants
+SLB,St. Louis Browns (AL)
+SLC,St. Louis Cardinals (1902–1953)
+STL,St. Louis Cardinals/Perfectos/Browns/Brown Stockings
+TB,Tampa Bay (Devil) Rays
+TEX,Texas Rangers
+TOR,Toronto Blue Jays
+WSH,Washington Senators (original- 1901–1960) expansion- 1961–1971) or Washington Nationals (since 2005)

Review comment:
       newline

##########
File path: pinot-tools/src/main/resources/examples/batch/dimBaseballTeams/dimBaseballTeams_offline_table_config.json
##########
@@ -0,0 +1,23 @@
+{
+  "tableName": "dimBaseballTeams",
+  "tableType": "OFFLINE",
+  "isDimTable": "true",

Review comment:
       I think we either have `tableType`:`DIMENSION` or `"isDimTable": "true"`, to be consistent with your `TableDataManagerType`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org