Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2020/05/28 14:38:57 UTC

[GitHub] [drill] cgivre commented on a change in pull request #1961: DRILL-3637: Add Elasticsearch Storage Plugin

cgivre commented on a change in pull request #1961:
URL: https://github.com/apache/drill/pull/1961#discussion_r431888630



##########
File path: contrib/storage-elastic/README.md
##########
@@ -0,0 +1,36 @@
+# ElasticSearch Storage Plugin 
+
+This plugin enables you to query ElasticSearch from Apache Drill.  
+
+Tested with ElasticSearch versions:
+* 5.6

Review comment:
       Hi Sanel, 
   Thanks for the feedback. Bottom line: yes, this needs to be updated to use the latest ES libraries.  
   
   Here's some backstory. For a long time, I have wanted to learn how to write storage plugins for Drill. I started by looking for existing work and found a number of plugins that people had written to connect Drill to various systems, including:
   * Elasticsearch
   * Solr
   * Cassandra
   * HTTP/REST (now committed)
   * Couchbase
   * Druid (there is currently an active PR that is close to being done)
   
   Anyway, I took a look at the ES and Cassandra plugins and wanted to see if I could get them to work at all with the latest versions of Drill.  I was able to get the Cassandra and ES plugins to work.  My goal is to get these working and committed to Drill as I feel that they would be extremely valuable for users.  
   
   With that said, ES is a challenge for a few reasons. A lot of people have worked on it, and one of the main challenges is that there are significant differences between the major versions. As a result, I don't think it's possible to use the high-level API client, because it is tied to the version of ES; i.e., you can't use the v7 high-level client with ES v6. If I'm wrong about this, please let me know, as that would greatly simplify an ES plugin.
   
   I took a look at the EOL dates [1] for ES and it looks like versions 6 and 7 are really the versions we should support at this time.  
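   
   As a rough sketch of what gating on the ES major version could look like: the class and method names below are hypothetical and not part of this PR, and the sketch assumes the plugin has already obtained the cluster's version string (e.g. "7.6.2") by some means. It only illustrates the idea of supporting versions 6 and 7.

```java
/**
 * Hypothetical helper (not part of the PR): decides whether a cluster's
 * version string falls within the major versions the plugin would support.
 */
public final class EsVersionSupport {

  private EsVersionSupport() {
    // Static utility class; no instances.
  }

  /** Extracts the major version from a string such as "7.6.2". */
  public static int majorVersion(String versionNumber) {
    int dot = versionNumber.indexOf('.');
    String major = dot < 0 ? versionNumber : versionNumber.substring(0, dot);
    return Integer.parseInt(major.trim());
  }

  /** Assumption: only major versions 6 and 7 are supported. */
  public static boolean isSupported(String versionNumber) {
    int major = majorVersion(versionNumber);
    return major == 6 || major == 7;
  }
}
```

   With a check like this, the plugin could fail fast with a clear error message when pointed at an unsupported cluster, rather than failing later on a version-specific API difference.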
   
   The other thing I'm trying to figure out is whether it is possible to use the Calcite adapters for query planning. Calcite (which Drill uses for query planning) has an adapter for ES with all kinds of optimizer rules [2]. If I could figure out how to use these rules in the Drill context, it would make an ES plugin really solid.
   
   I'm working on a few other things at the moment, but I haven't abandoned this PR.  I put it up as a draft PR so that I can get feedback and assistance as I'm not an ES expert by any means. 
   
   [1]: https://www.elastic.co/support/eol
   [2]: https://calcite.apache.org/docs/elasticsearch_adapter.html




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org