Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/01/08 20:20:00 UTC

[jira] [Commented] (DRILL-7437) Storage Plugin for Generic HTTP REST API

    [ https://issues.apache.org/jira/browse/DRILL-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010992#comment-17010992 ] 

ASF GitHub Bot commented on DRILL-7437:
---------------------------------------

cgivre commented on pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r364424899
 
 

 ##########
 File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
 ##########
 @@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+@JsonTypeName("http-scan")
+public class HttpGroupScan extends AbstractGroupScan {
+  private static final Logger logger = LoggerFactory.getLogger(HttpGroupScan.class);
+
+  private List<SchemaPath> columns;
+  private final HttpScanSpec httpScanSpec;
+  private final HttpStoragePluginConfig config;
+
+  public HttpGroupScan (
+    HttpStoragePluginConfig config,
+    HttpScanSpec scanSpec,
+    List<SchemaPath> columns
+  ) {
+    super("no-user");
+    this.config = config;
+    this.httpScanSpec = scanSpec;
+    this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : columns;
+  }
+
+  public HttpGroupScan(HttpGroupScan that) {
+    super(that);
+    config = that.config();
+    httpScanSpec = that.httpScanSpec();
+    columns = that.getColumns();
+  }
+
+  @JsonCreator
+  public HttpGroupScan(
+    @JsonProperty("config") HttpStoragePluginConfig config,
+    @JsonProperty("columns") List<SchemaPath> columns,
+    @JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec,
+    @JacksonInject StoragePluginRegistry engineRegistry
+  ) {
+    super("no-user");
+    this.config = config;
+    this.columns = columns;
+    this.httpScanSpec = httpScanSpec;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() { return config; }
+
+  @JsonProperty("columns")
+  public List<SchemaPath> columns() { return columns; }
+
+  @JsonProperty("httpScanSpec")
+  public HttpScanSpec httpScanSpec() { return httpScanSpec; }
+
+  @Override
+  public void applyAssignments(List<DrillbitEndpoint> endpoints) {
+    logger.debug("HttpGroupScan applyAssignments");
+  }
+
+  @Override
+  @JsonIgnore
+  public int getMaxParallelizationWidth() {
+    return 0;
+  }
+
+  @Override
+  public boolean canPushdownProjects(List<SchemaPath> columns) {
+    return true;
+  }
+
+  @Override
+  public SubScan getSpecificScan(int minorFragmentId) {
+    logger.debug("HttpGroupScan getSpecificScan");
+    return new HttpSubScan(config, httpScanSpec, columns);
+  }
+
+  @Override
+  public GroupScan clone(List<SchemaPath> columns) {
+    logger.debug("HttpGroupScan clone {}", columns);
+    HttpGroupScan newScan = new HttpGroupScan(this);
+    newScan.columns = columns;
 
 Review comment:
   I hear what you're saying, but it still needs the `scanSpec`.  Where should that come from?  
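
One possible answer to the question above: the copy constructor in the diff already takes `httpScanSpec` from the scan being copied, so a clone that only changes the projected columns could take its scan spec from `this`. The following is a minimal, self-contained sketch of that idea, not the PR's code. `ScanSpecSketch`, `GroupScanSketch`, `cloneWithColumns`, and the example URL are hypothetical stand-ins written in plain Java with no Drill classes, and whether this matches what the reviewer proposed is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for HttpScanSpec; the real class is not shown in this diff.
final class ScanSpecSketch {
  private final String uri;

  ScanSpecSketch(String uri) {
    this.uri = uri;
  }

  String uri() {
    return uri;
  }
}

// Hypothetical, simplified stand-in for HttpGroupScan (assumption: columns as plain strings).
final class GroupScanSketch {
  private final ScanSpecSketch scanSpec;
  private final List<String> columns;

  GroupScanSketch(ScanSpecSketch scanSpec, List<String> columns) {
    this.scanSpec = scanSpec;
    // Defensive copy so later changes to the caller's list do not leak in.
    this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
  }

  // Copy constructor variant: the scan spec comes from the scan being copied,
  // and only the projected columns are replaced.
  GroupScanSketch(GroupScanSketch that, List<String> columns) {
    this(that.scanSpec, columns);
  }

  // Plays the role of clone(List<SchemaPath>) in the diff: no field is mutated after construction.
  GroupScanSketch cloneWithColumns(List<String> newColumns) {
    return new GroupScanSketch(this, newColumns);
  }

  ScanSpecSketch scanSpec() {
    return scanSpec;
  }

  List<String> columns() {
    return columns;
  }
}

class CloneSketch {
  public static void main(String[] args) {
    GroupScanSketch original = new GroupScanSketch(
        new ScanSpecSketch("http://example.invalid/api"), Arrays.asList("*"));
    GroupScanSketch projected = original.cloneWithColumns(Arrays.asList("a", "b"));

    // The clone shares the original's scan spec but carries its own column list.
    System.out.println(projected.scanSpec() == original.scanSpec());
    System.out.println(projected.columns());
  }
}
```

With this shape, both fields of the sketch can be `final`, since the clone never assigns `columns` after construction.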
 


> Storage Plugin for Generic HTTP REST API
> ----------------------------------------
>
>                 Key: DRILL-7437
>                 URL: https://issues.apache.org/jira/browse/DRILL-7437
>             Project: Apache Drill
>          Issue Type: New Feature
>            Reporter: Charles Givre
>            Assignee: Charles Givre
>            Priority: Minor
>             Fix For: Future
>
>
> In many data analytics situations, there is a need to obtain reference data that is volatile or hosted on a service with a REST API.
> For instance, consider a financial dataset on which you want to run a currency conversion.  Or, in the security arena, an organization might have a service that returns network information about an IT asset.  The goal is to enable Drill to quickly incorporate external data that is only accessible via a REST API.
> This plugin is not intended to be a substitute for dedicated storage plugins for systems that use a REST API, such as Apache Solr or Elasticsearch.
> This plugin is based on several projects that were posted on GitHub but never completed or submitted to Drill.  They are listed here for attribution:
>  * [https://github.com/kevinlynx/drill-storage-http]
>  * [https://github.com/mayunSaicmotor/drill-storage-http]
>  


