Posted to issues@drill.apache.org by "Sudheesh Katkam (JIRA)" <ji...@apache.org> on 2016/09/16 18:32:20 UTC
[jira] [Comment Edited] (DRILL-4892) Swift Documentation

    [ https://issues.apache.org/jira/browse/DRILL-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15497034#comment-15497034 ] 

Sudheesh Katkam edited comment on DRILL-4892 at 9/16/16 6:31 PM:
-----------------------------------------------------------------

From email:

{quote}
AFAIK, there is no documentation, and I am not sure anyone has tried it before. That said, according to [1], Hadoop's Swift support enables Apache Hadoop applications, including MapReduce jobs, to read and write data to and from instances of the OpenStack Swift object store. Drill uses the HDFS client library, so using Swift through Drill should be possible.

My guess: create a storage plugin named “swift” and copy the contents from the “dfs” plugin. I am not sure what the contents of “swift” should be exactly; see [1] and [2]. The parameters and values mentioned in the “Configuring” section of [1] should be provided through the “config” map in the storage plugin (or maybe through conf/core-site.xml in the Drill installation directory).

Something like:
{
  "type": "file",
  "enabled": true,
  "connection": "swift://dmitry.privatecloud/out/results",
  "workspaces": {
    ...
  },
  "formats": {
    ...
  },
  "config": {
    ...
  }
}

A roundabout alternative would be to access Swift through an S3-compatible API [3]. Again, I do not know the exact configuration details.

Once you get things to work, you can also add a section to the Drill docs based on your experience!

Thank you,
Sudheesh

[1] https://hadoop.apache.org/docs/stable2/hadoop-openstack/index.html
[2] http://drill.apache.org/docs/s3-storage-plugin/
[3] https://github.com/openstack/swift3
{quote}
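
To make the guess in the quoted email more concrete, here is a sketch of what a filled-in "swift" storage plugin might look like. It is untested against Drill and should be read as a starting point only. The fs.swift.* property names follow the "Configuring" section of the Hadoop OpenStack documentation linked as [1]; the service name "privatecloud" is taken from the host part of the connection URL, which has the form swift://<container>.<service>/<path>. The auth URL, tenant, username, and password values are placeholders, and the workspace and format entries are simply what a typical "dfs" plugin contains. The sketch also assumes the hadoop-openstack jar has been made available on Drill's classpath.

```json
{
  "type": "file",
  "enabled": true,
  "connection": "swift://dmitry.privatecloud/out/results",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "csv": {
      "type": "text",
      "extensions": ["csv"],
      "delimiter": ","
    }
  },
  "config": {
    "fs.swift.impl": "org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem",
    "fs.swift.service.privatecloud.auth.url": "https://keystone.example.com:5000/v2.0/tokens",
    "fs.swift.service.privatecloud.tenant": "TENANT_PLACEHOLDER",
    "fs.swift.service.privatecloud.username": "USERNAME_PLACEHOLDER",
    "fs.swift.service.privatecloud.password": "PASSWORD_PLACEHOLDER"
  }
}
```

If the "config" map turns out not to be passed through to the Hadoop client, the same fs.swift.* properties could instead be tried in conf/core-site.xml, as the email suggests.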


> Swift Documentation
> -------------------
>
>                 Key: DRILL-4892
>                 URL: https://issues.apache.org/jira/browse/DRILL-4892
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.6.0, 1.8.0
>            Reporter: Matt Keranen
>
> The Drill FAQ (https://drill.apache.org/faq/) suggests Swift is a data source:
> "Cloud storage: Amazon S3, Google Cloud Storage, Azure Blog Storage, Swift"
> However, there appears to be no documentation (?)
> Swift-specific docs would be very useful. We have a large Swift installation, and using Drill over the files in it would be a valuable feature.
