You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/11/30 17:38:37 UTC

[GitHub] [pinot] klsince commented on a change in pull request #7820: push JSON Path evaluation down to storage layer

klsince commented on a change in pull request #7820:
URL: https://github.com/apache/pinot/pull/7820#discussion_r759502515



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
##########
@@ -100,6 +101,19 @@ public void fetchIntValues(String column, int[] inDocIds, int length, int[] outV
     _columnValueReaderMap.get(column).readIntValues(inDocIds, length, outValues);
   }
 
+  /**
+   * Fetch the int values from a JSON column.

Review comment:
       nit: is `JSON column` too specific for the method? 

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/evaluator/TransformEvaluator.java
##########
@@ -0,0 +1,152 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.evaluator;
+
+import org.apache.pinot.segment.spi.index.reader.Dictionary;
+import org.apache.pinot.segment.spi.index.reader.ForwardIndexReader;
+import org.apache.pinot.segment.spi.index.reader.ForwardIndexReaderContext;
+
+
+public interface TransformEvaluator {
+  /**
+   * Evaluate the JSON path and fill the value buffer

Review comment:
       nit: `JSON path` is too specific here?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/PushDownTransformFunction.java
##########
@@ -0,0 +1,124 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator.transform.function;
+
+import org.apache.pinot.core.operator.blocks.ProjectionBlock;
+import org.apache.pinot.segment.spi.evaluator.TransformEvaluator;
+
+
+public interface PushDownTransformFunction {

Review comment:
       nit: maybe add some comments about who can implement this interface?

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/evaluator/json/JsonPathEvaluators.java
##########
@@ -0,0 +1,148 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.evaluator.json;
+
+import com.google.common.base.Preconditions;
+import java.lang.invoke.MethodHandle;
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * Allows registration of a custom {@see JsonPathEvaluator} which can handle custom storage
+ * functionality also present in a plugin. A default evaluator which can handle all default
+ * storage types will be provided to delegate to when standard storage types are encountered.
+ */
+public final class JsonPathEvaluators {
+
+  private static final Logger LOGGER = LoggerFactory.getLogger(JsonPathEvaluators.class);
+
+  private static final AtomicReferenceFieldUpdater<JsonPathEvaluators, JsonPathEvaluatorProvider> UPDATER =
+      AtomicReferenceFieldUpdater.newUpdater(JsonPathEvaluators.class, JsonPathEvaluatorProvider.class, "_provider");
+  private static final JsonPathEvaluators INSTANCE = new JsonPathEvaluators();
+  private static final DefaultProvider DEFAULT_PROVIDER = new DefaultProvider();
+  private volatile JsonPathEvaluatorProvider _provider;
+
+  /**
+   * Registration point to override how JSON paths are evaluated. This should be used
+   * when a Pinot plugin has special storage capabilities. For instance, imagine a
+   * plugin with a raw forward index which stores JSON in a binary format which
+   * pinot-core is unaware of and cannot evaluate JSON paths against (pinot-core only
+   * understands true JSON documents). Whenever JSON paths are evaluated against this
+   * custom storage, different storage access operations may be required, and the provided
+   * {@see JsonPathEvaluator} can inspect the provided {@see ForwardIndexReader} to
+   * determine whether it is the custom implementation and evaluate the JSON path against
+   * the binary JSON managed by the custom reader. If it is not the custom implementation,
+   * then the evaluation should be delegated to the provided delegate.
+   *
+   * This prevents the interface {@see ForwardIndexReader} from needing to be able to model
+   * any plugin storage format, which creates flexibility for the kinds of data structure
+   * plugins can employ.
+   *
+   * @param provider provides {@see JsonPathEvaluator}
+   * @return true if registration is successful, false otherwise
+   */
+  public static boolean registerProvider(JsonPathEvaluatorProvider provider) {
+    Preconditions.checkArgument(provider != null, "");
+    if (!UPDATER.compareAndSet(INSTANCE, null, provider)) {
+      LOGGER.warn("failed to register {} - {} already registered", provider, INSTANCE._provider);
+      return false;
+    }
+    return true;
+  }
+
+  /**
+   * pinot-core must construct {@see JsonPathEvaluator} via this method to ensure it uses
+   * the registered implementation. Using the registered implementation allows pinot-core
+   * to evaluate JSON paths against data structures it doesn't understand or model.
+   * @param jsonPath the JSON path
+   * @param defaultValue the default value
+   * @return a JSON path evaluator which must understand all possible storage representations of JSON.
+   */
+  public static JsonPathEvaluator create(String jsonPath, Object defaultValue) {
+    // plugins compose and delegate to the default implementation.
+    JsonPathEvaluator defaultEvaluator = DEFAULT_PROVIDER.create(jsonPath, defaultValue);
+    return Holder.PROVIDER.create(defaultEvaluator, jsonPath, defaultValue);
+  }
+
+  /**
+   * Storing the registered evaluator in this holder and initialising it during
+   * the class load gives the best of both worlds: plugins have until the first
+   * JSON path evaluation to register an evaluator via
+   * {@see JsonPathEvaluators#registerProvider}, but once this class is loaded,
+   * the provider is constant and calls may be optimise aggressively by the JVM
+   * in ways which are impossible with a volatile reference.
+   */
+  private static final class Holder {

Review comment:
       👍 

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/blocks/TransformBlock.java
##########
@@ -78,4 +79,54 @@ public BlockDocIdValueSet getBlockDocIdValueSet() {
   public BlockMetadata getMetadata() {
     throw new UnsupportedOperationException();
   }
+
+  @Override
+  public void pushDown(String column, TransformEvaluator evaluator, int[] buffer) {
+    _projectionBlock.pushDown(column, evaluator, buffer);

Review comment:
       any need to check `if (expression.getType() == ExpressionContext.Type.IDENTIFIER)` like L51 above? iiuc, we pushDown only when there is no transforms, i.e. a pass-through to projection block. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org