
Posted to dev@drill.apache.org by Vicente Arteaga <vi...@hotmail.com> on 2014/12/02 12:40:33 UTC

Configuration example for a s3:// source for Drill

Hi!
I attended nosqlmatters2014 and am eager to test Drill! Good work! I've played around with it a bit, and now I would like to connect to S3, where I currently store my files.
Could you please provide a configuration example? I don't see where to fill in the AWS credentials, and I haven't found an example in either the wiki or the documentation.
Thanks! Best regards,
Vicente

RE: Configuration example for a s3:// source for Drill

Posted by Nick Amato <na...@maprtech.com>.

Vicente,

The S3 configuration is similar to a file configuration.  At the bottom of
this email is a storage plugin configuration that works (set 'connection'
to 's3://bucketname', using your own bucket name).  A few other steps are
required on each Drill node:
- Download the jets3t-*.jar file from
https://jets3t.s3.amazonaws.com/downloads.html and copy it into
$DRILL_HOME/opt/drill/jars/3rdparty/
- Add your S3 credentials to $DRILL_HOME/conf/core-site.xml (see my
example below).  You need to define these properties:
    - fs.s3.awsAccessKeyId
    - fs.s3.awsSecretAccessKey
    - fs.s3n.awsAccessKeyId
    - fs.s3n.awsSecretAccessKey
- Edit $DRILL_HOME/bin/hadoop_excludes.txt and remove the 'jets3t' entry

You should be able to run queries after that...
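
The hadoop_excludes.txt edit in the last step could be scripted roughly as follows. This is only a sketch: it assumes the file holds one exclusion pattern per line, and the sample entries and the helper names (drop_entries, edit_excludes_file) are made up for illustration, not taken from a real Drill install.

```python
from pathlib import Path


def drop_entries(lines, needle="jets3t"):
    """Return the given lines without any entry that mentions `needle`."""
    return [line for line in lines if needle not in line]


def edit_excludes_file(path, needle="jets3t"):
    """Rewrite the excludes file in place, dropping entries that mention `needle`.

    Assumption: one pattern per line. Back the file up before running this.
    """
    p = Path(path)
    original = p.read_text().splitlines(keepends=True)
    p.write_text("".join(drop_entries(original, needle)))


# Hypothetical file contents, for illustration only.
sample = ["asm\n", "jets3t\n", "log4j\n"]
print(drop_entries(sample))
```

On a node you would call edit_excludes_file with the path from the step above, once per Drill node.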

Nick


Example storage plugin configuration:

{
  "type": "file",
  "enabled": true,
  "connection": "s3://bucketname",
  "workspaces": {
    "default": {
      "location": "/",
      "writable": false,
      "storageformat": null
    },
    "root": {
      "location": "/",
      "writable": true,
      "storageformat": "json"
    }
  },
  "formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
      ],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json"
    }
  }
}
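
If you have several buckets, the only field that changes in the configuration above is 'connection'. A small sketch of generating the JSON per bucket is below; the function name is hypothetical, and the 'formats' section is cut down to two entries just to keep the example short.

```python
import json


def s3_plugin_config(bucket):
    """Build a storage plugin configuration like the one above for `bucket`."""
    return {
        "type": "file",
        "enabled": True,
        "connection": f"s3://{bucket}",
        "workspaces": {
            "default": {"location": "/", "writable": False, "storageformat": None},
            "root": {"location": "/", "writable": True, "storageformat": "json"},
        },
        # Shortened for illustration; add the psv/tsv/parquet entries as needed.
        "formats": {
            "csv": {"type": "text", "extensions": ["csv"], "delimiter": ","},
            "json": {"type": "json"},
        },
    }


config_text = json.dumps(s3_plugin_config("bucketname"), indent=2)
print(config_text)
```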

Example entries in core-site.xml:

<configuration>
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>ID</value>
</property>

<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>SECRET</value>
</property>

<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>ID</value>
</property>

<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>SECRET</value>
</property>
</configuration>
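
Since the same file has to exist on every node, it may be easier to generate it than to edit it by hand. A minimal sketch is below, assuming the four property names listed above are all you need; the function name is hypothetical and "ID" / "SECRET" are placeholders for your real credentials.

```python
import xml.etree.ElementTree as ET

# The four property names listed in the email above.
S3_PROPERTY_NAMES = [
    "fs.s3.awsAccessKeyId",
    "fs.s3.awsSecretAccessKey",
    "fs.s3n.awsAccessKeyId",
    "fs.s3n.awsSecretAccessKey",
]


def build_core_site(access_key_id, secret_access_key):
    """Return core-site.xml text containing the S3 credential properties."""
    root = ET.Element("configuration")
    for name in S3_PROPERTY_NAMES:
        # Names ending in awsAccessKeyId get the key ID; the rest get the secret.
        if name.endswith("awsAccessKeyId"):
            value = access_key_id
        else:
            value = secret_access_key
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = build_core_site("ID", "SECRET")
print(xml_text)
```

The output is a single unindented line, which is still valid XML. Keep the generated file out of version control, since it contains the secret key.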

-----Original Message-----
From: Vicente Arteaga [mailto:vicentearteaga@hotmail.com]
Sent: Tuesday, December 02, 2014 3:41 AM
To: drill-dev@incubator.apache.org
Subject: Configuration example for a s3:// source for Drill

Hi!
I attended nosqlmatters2014 and am eager to test Drill! Good work! I've
played around with it a bit, and now I would like to connect to S3, where
I currently store my files.
Could you please provide a configuration example? I don't see where to
fill in the AWS credentials, and I haven't found an example in either the
wiki or the documentation.
Thanks! Best regards,
Vicente
