You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/06/14 21:37:54 UTC

[GitHub] [incubator-druid] jon-wei commented on a change in pull request #7838: Improve IncrementalIndex concurrency scalability

jon-wei commented on a change in pull request #7838: Improve IncrementalIndex concurrency scalability
URL: https://github.com/apache/incubator-druid/pull/7838#discussion_r294002577
 
 

 ##########
 File path: processing/src/main/java/org/apache/druid/segment/incremental/IncrementalIndex.java
 ##########
 @@ -627,98 +624,96 @@ IncrementalIndexRowResult toIncrementalIndexRow(InputRow row)
     if (row.getTimestampFromEpoch() < minTimestamp) {
       throw new IAE("Cannot add row[%s] because it is below the minTimestamp[%s]", row, DateTimes.utc(minTimestamp));
     }
-
     final List<String> rowDimensions = row.getDimensions();
 
-    Object[] dims;
-    List<Object> overflow = null;
+    Map<String, Object> rowDimKeys = new HashMap<>();
     long dimsKeySize = 0;
     List<String> parseExceptionMessages = new ArrayList<>();
-    synchronized (dimensionDescs) {
-      dims = new Object[dimensionDescs.size()];
-      for (String dimension : rowDimensions) {
-        if (Strings.isNullOrEmpty(dimension)) {
-          continue;
-        }
-        boolean wasNewDim = false;
-        ColumnCapabilitiesImpl capabilities;
-        DimensionDesc desc = dimensionDescs.get(dimension);
-        if (desc != null) {
-          capabilities = desc.getCapabilities();
-        } else {
-          wasNewDim = true;
-          capabilities = columnCapabilities.get(dimension);
-          if (capabilities == null) {
-            capabilities = new ColumnCapabilitiesImpl();
-            // For schemaless type discovery, assume everything is a String for now, can change later.
-            capabilities.setType(ValueType.STRING);
-            capabilities.setDictionaryEncoded(true);
-            capabilities.setHasBitmapIndexes(true);
-            columnCapabilities.put(dimension, capabilities);
-          }
-          DimensionHandler handler = DimensionHandlerUtils.getHandlerFromCapabilities(dimension, capabilities, null);
-          desc = addNewDimension(dimension, capabilities, handler);
-        }
-        DimensionHandler handler = desc.getHandler();
-        DimensionIndexer indexer = desc.getIndexer();
-        Object dimsKey = null;
-        try {
-          dimsKey = indexer.processRowValsToUnsortedEncodedKeyComponent(
-              row.getRaw(dimension),
-              true
-          );
+
+    DimensionData prevDimensionData = this.dimensions.get();
+    DimensionData dimensionData = null;
+    for (String dimension : rowDimensions) {
+      if (Strings.isNullOrEmpty(dimension)) {
+        continue;
+      }
+
+      if (rowDimKeys.containsKey(dimension)) {
+        // If the dims map already contains a mapping at this index, it means we have seen this dimension already on this input row.
+        throw new ISE("Dimension[%s] occurred more than once in InputRow", dimension);
 
 Review comment:
   For context, I think this is where that check originated: https://github.com/apache/incubator-druid/issues/63
   
   For the duplicate dimension case, making it permissive seems like it would be fine, adding an entry to `parseExceptionMessages` in `toIncrementalIndexRow` when this happens will ultimately result in the row being ingested, with the `processedWithError` ingestion counter being incremented, which sounds appropriate to me.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org