Posted to user@drill.apache.org by Shuporno Choudhury <sh...@manthan.com> on 2017/05/22 10:27:24 UTC

Writing to s3 using Drill

Hi,

Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
workspace?
Whenever I try, it gives me the following error:

*Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
either root schema or current default schema.*
*Current default schema:  s3.root*

Also, s3.tmp doesn't appear in the output of the "*show schemas*" command,
even though the tmp workspace exists in the web console.

I am using Drill Version 1.10; embedded mode on my local system.

However, I have no problem reading from an s3 bucket; the problem is only
with writing to an s3 bucket.
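
For reference, this is the kind of statement I'm attempting (the table name and
source here are placeholders, not the actual ones):

create table s3.tmp.`my_output` as select * from s3.root.`some_input.csv`;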
-- 
Regards,
Shuporno Choudhury

Re: Writing to s3 using Drill

Posted by Gautam Parai <gp...@mapr.com>.
Hi Shuporno,


Could you please share the configuration of your S3 storage plugin here, along with the output of `show schemas` as it pertains to s3? Is `writable` set to true?
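
For reference, a writable workspace entry usually looks something like the below
(the workspace name and location are placeholders):

"tmp": {
  "location": "/some/path",
  "writable": true,
  "defaultInputFormat": null
}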


Gautam

________________________________
From: Shuporno Choudhury <sh...@manthan.com>
Sent: Monday, May 22, 2017 3:27:24 AM
To: user@drill.apache.org
Subject: Writing to s3 using Drill

Hi,

Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
workspace?
Whenever I try, it gives me the follwing error:

*Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
either root schema or current default schema.*
*Current default schema:  s3.root*

Also, s3.tmp doesn't appear while using the command "*show schemas*" though
the tmp workspace exists in the web console

I am using Drill Version 1.10; embedded mode on my local system.

However, I have no problem reading from an s3 bucket, the problem is only
writing to a s3 bucket.
--
Regards,
Shuporno Choudhury

Re: Writing to s3 using Drill

Posted by Gautam Parai <gp...@mapr.com>.
Hi Shuporno,


Did you try following the suggestions from Abhishek? Please let us know your observations. Also, please share the CTAS command you are using to write to s3.


Gautam


________________________________
From: Shuporno Choudhury <sh...@manthan.com>
Sent: Thursday, May 25, 2017 12:11:40 AM
To: user@drill.apache.org
Subject: Re: Writing to s3 using Drill

My s3 plugin info is as follows:

{
  "type": "file",
  "enabled": true,
  "connection": "s3a://abcd",
  "config": {
    "fs.s3a.access.key": "abcd",
    "fs.s3a.secret.key": "abcd"
  },
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/",
      "writable": *true*,
      "defaultInputFormat": "parquet"
    }
  }


I have removed the info about the formats to keep the mail small.
Also, I am using Dill on *Windows 10*

On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury <
shuporno.choudhury@manthan.com> wrote:

> Hi,
>
> Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
> workspace?
> Whenever I try, it gives me the follwing error:
>
> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
> either root schema or current default schema.*
> *Current default schema:  s3.root*
>
> Also, s3.tmp doesn't appear while using the command "*show schemas*"
> though the tmp workspace exists in the web console
>
> I am using Drill Version 1.10; embedded mode on my local system.
>
> However, I have no problem reading from an s3 bucket, the problem is only
> writing to a s3 bucket.
> --
> Regards,
> Shuporno Choudhury
>



--
Regards,
Shuporno Choudhury

Re: Writing to s3 using Drill

Posted by Shuporno Choudhury <sh...@manthan.com>.
Hi Nitin,

   The "tmp" object exists inside the s3 bucket. Even though, it throws the
same error:

        Error: SYSTEM ERROR: IllegalArgumentException: URI has an authority
component
        Fragment 0:0


On Fri, May 26, 2017 at 5:14 PM, Nitin Pawar <ni...@gmail.com>
wrote:

> you have to create a tmp object in your bucket to make it work.
> s3://bucket_name/tmp has to be created and then it should work
>
> On Fri, May 26, 2017 at 5:02 PM, Shuporno Choudhury <
> shuporno.choudhury@manthan.com> wrote:
>
> > Hi Nitin,
> >
> > Thanks for the config settings.
> >
> > Now, after entering those config settings
> >     1. s3.tmp does appear in the "show schemas" result
> >     2. Also, it doesn't disappear when I add a custom folder in the
> > location attribute
> >
> > But when I try to run a CTAS statement, I get the following error:
> >
> > *Error: SYSTEM ERROR: IllegalArgumentException: URI has an authority
> > component*
> > *Fragment 0:0*
> >
> > Query that I am trying to run:
> > *create table s3.tmp.`abcd` as select 1 from (values(1));*
> >
> > However, this query runs when I use dfs.tmp instead of s3.tmp
> >
> > On Fri, May 26, 2017 at 12:44 PM, Nitin Pawar <ni...@gmail.com>
> > wrote:
> >
> > > Can you try with following s3 config
> > >
> > > {
> > >   "type": "file",
> > >   "enabled": true,
> > >   "connection": "s3a://bucket_name",
> > >   "config": {
> > >
> > >     "fs.s3a.connection.maximum": "10000",
> > >     "fs.s3a.access.key": "access_key",
> > >     "fs.s3a.secret.key": "secret_key",
> > >     "fs.s3a.buffer.dir": "/tmp",
> > >     "fs.s3a.multipart.size": "10485760",
> > >     "fs.s3a.multipart.threshold": "104857600"
> > >   },
> > >   "workspaces": {
> > >     "root": {
> > >       "location": "/",
> > >       "writable": false,
> > >       "defaultInputFormat": null
> > >     },
> > >     "tmp": {
> > >       "location": "/tmp",
> > >       "writable": true,
> > >       "defaultInputFormat": null
> > >     }
> > >   },
> > >   "formats": {
> > >     "psv": {
> > >       "type": "text",
> > >       "extensions": [
> > >         "tbl"
> > >       ],
> > >       "delimiter": "|"
> > >     },
> > >     "csv": {
> > >       "type": "text",
> > >       "extensions": [
> > >         "csv"
> > >       ],
> > >       "extractHeader": true,
> > >       "delimiter": ","
> > >     },
> > >     "tsv": {
> > >       "type": "text",
> > >       "extensions": [
> > >         "tsv"
> > >       ],
> > >       "delimiter": "\t"
> > >     },
> > >     "parquet": {
> > >       "type": "parquet"
> > >     },
> > >     "json": {
> > >       "type": "json",
> > >       "extensions": [
> > >         "json"
> > >       ]
> > >     },
> > >     "avro": {
> > >       "type": "avro"
> > >     },
> > >     "sequencefile": {
> > >       "type": "sequencefile",
> > >       "extensions": [
> > >         "seq"
> > >       ]
> > >     },
> > >     "csvh": {
> > >       "type": "text",
> > >       "extensions": [
> > >         "csvh"
> > >       ],
> > >       "extractHeader": true,
> > >       "delimiter": ","
> > >     }
> > >   }
> > > }
> > >
> > > On Fri, May 26, 2017 at 10:29 AM, Shuporno Choudhury <
> > > shuporno.choudhury@manthan.com> wrote:
> > >
> > > > Hi,
> > > > Can someone at Drill help me with issue please?
> > > >
> > > > On Thu, May 25, 2017 at 1:33 PM, Shuporno Choudhury <
> > > > shuporno.choudhury@manthan.com> wrote:
> > > >
> > > > > HI,
> > > > >
> > > > > I corrected the "show schemas"  output by putting only "/" in the
> > > > > "location" . Now it shows s3.tmp in the output.
> > > > >
> > > > > But, it has a weird problem.
> > > > > The moment I add a folder to the location, eg: "/myfolder", then
> > s3.tmp
> > > > > vanishes from the "show schemas" output.
> > > > >
> > > > > Also, when I try to write into s3, I get the following error:
> > > > >
> > > > > Exception in thread "drill-executor-9" java.lang.
> > UnsatisfiedLinkError:
> > > > > org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(
> > > > > Ljava/lang/String;I)Z
> > > > >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(
> > > Native
> > > > > Method)+--+
> > > > >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(
> > > > > NativeIO.java:609)
> > > > >
> > > > > This is only a snippet of the error associated with writing to s3
> > > > >
> > > > > On Thu, May 25, 2017 at 12:41 PM, Shuporno Choudhury <
> > > > > shuporno.choudhury@manthan.com> wrote:
> > > > >
> > > > >> My s3 plugin info is as follows:
> > > > >>
> > > > >> {
> > > > >>   "type": "file",
> > > > >>   "enabled": true,
> > > > >>   "connection": "s3a://abcd",
> > > > >>   "config": {
> > > > >>     "fs.s3a.access.key": "abcd",
> > > > >>     "fs.s3a.secret.key": "abcd"
> > > > >>   },
> > > > >>   "workspaces": {
> > > > >>     "root": {
> > > > >>       "location": "/",
> > > > >>       "writable": false,
> > > > >>       "defaultInputFormat": null
> > > > >>     },
> > > > >>     "tmp": {
> > > > >>       "location": "/",
> > > > >>       "writable": *true*,
> > > > >>       "defaultInputFormat": "parquet"
> > > > >>     }
> > > > >>   }
> > > > >>
> > > > >>
> > > > >> I have removed the info about the formats to keep the mail small.
> > > > >> Also, I am using Dill on *Windows 10*
> > > > >>
> > > > >> On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury <
> > > > >> shuporno.choudhury@manthan.com> wrote:
> > > > >>
> > > > >>> Hi,
> > > > >>>
> > > > >>> Is it possible to write to a folder in an s3 bucket using the
> > > *s3.tmp*
> > > > >>> workspace?
> > > > >>> Whenever I try, it gives me the follwing error:
> > > > >>>
> > > > >>> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with
> respect
> > > to
> > > > >>> either root schema or current default schema.*
> > > > >>> *Current default schema:  s3.root*
> > > > >>>
> > > > >>> Also, s3.tmp doesn't appear while using the command "*show
> > schemas*"
> > > > >>> though the tmp workspace exists in the web console
> > > > >>>
> > > > >>> I am using Drill Version 1.10; embedded mode on my local system.
> > > > >>>
> > > > >>> However, I have no problem reading from an s3 bucket, the problem
> > is
> > > > >>> only writing to a s3 bucket.
> > > > >>> --
> > > > >>> Regards,
> > > > >>> Shuporno Choudhury
> > > > >>>
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Regards,
> > > > >> Shuporno Choudhury
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards,
> > > > > Shuporno Choudhury
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > > Shuporno Choudhury
> > > >
> > >
> > >
> > >
> > > --
> > > Nitin Pawar
> > >
> >
> >
> >
> > --
> > Regards,
> > Shuporno Choudhury
> >
>
>
>
> --
> Nitin Pawar
>



-- 
Regards,
Shuporno Choudhury

Re: Writing to s3 using Drill

Posted by Nitin Pawar <ni...@gmail.com>.
You have to create a tmp object in your bucket to make it work.
s3://bucket_name/tmp has to be created first, and then it should work.
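
If you have the AWS CLI handy, one way to create that placeholder object is
(bucket_name is a placeholder for your bucket):

aws s3api put-object --bucket bucket_name --key tmp/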

On Fri, May 26, 2017 at 5:02 PM, Shuporno Choudhury <
shuporno.choudhury@manthan.com> wrote:

> Hi Nitin,
>
> Thanks for the config settings.
>
> Now, after entering those config settings
>     1. s3.tmp does appear in the "show schemas" result
>     2. Also, it doesn't disappear when I add a custom folder in the
> location attribute
>
> But when I try to run a CTAS statement, I get the following error:
>
> *Error: SYSTEM ERROR: IllegalArgumentException: URI has an authority
> component*
> *Fragment 0:0*
>
> Query that I am trying to run:
> *create table s3.tmp.`abcd` as select 1 from (values(1));*
>
> However, this query runs when I use dfs.tmp instead of s3.tmp
>
> On Fri, May 26, 2017 at 12:44 PM, Nitin Pawar <ni...@gmail.com>
> wrote:
>
> > Can you try with following s3 config
> >
> > {
> >   "type": "file",
> >   "enabled": true,
> >   "connection": "s3a://bucket_name",
> >   "config": {
> >
> >     "fs.s3a.connection.maximum": "10000",
> >     "fs.s3a.access.key": "access_key",
> >     "fs.s3a.secret.key": "secret_key",
> >     "fs.s3a.buffer.dir": "/tmp",
> >     "fs.s3a.multipart.size": "10485760",
> >     "fs.s3a.multipart.threshold": "104857600"
> >   },
> >   "workspaces": {
> >     "root": {
> >       "location": "/",
> >       "writable": false,
> >       "defaultInputFormat": null
> >     },
> >     "tmp": {
> >       "location": "/tmp",
> >       "writable": true,
> >       "defaultInputFormat": null
> >     }
> >   },
> >   "formats": {
> >     "psv": {
> >       "type": "text",
> >       "extensions": [
> >         "tbl"
> >       ],
> >       "delimiter": "|"
> >     },
> >     "csv": {
> >       "type": "text",
> >       "extensions": [
> >         "csv"
> >       ],
> >       "extractHeader": true,
> >       "delimiter": ","
> >     },
> >     "tsv": {
> >       "type": "text",
> >       "extensions": [
> >         "tsv"
> >       ],
> >       "delimiter": "\t"
> >     },
> >     "parquet": {
> >       "type": "parquet"
> >     },
> >     "json": {
> >       "type": "json",
> >       "extensions": [
> >         "json"
> >       ]
> >     },
> >     "avro": {
> >       "type": "avro"
> >     },
> >     "sequencefile": {
> >       "type": "sequencefile",
> >       "extensions": [
> >         "seq"
> >       ]
> >     },
> >     "csvh": {
> >       "type": "text",
> >       "extensions": [
> >         "csvh"
> >       ],
> >       "extractHeader": true,
> >       "delimiter": ","
> >     }
> >   }
> > }
> >
> > On Fri, May 26, 2017 at 10:29 AM, Shuporno Choudhury <
> > shuporno.choudhury@manthan.com> wrote:
> >
> > > Hi,
> > > Can someone at Drill help me with issue please?
> > >
> > > On Thu, May 25, 2017 at 1:33 PM, Shuporno Choudhury <
> > > shuporno.choudhury@manthan.com> wrote:
> > >
> > > > HI,
> > > >
> > > > I corrected the "show schemas"  output by putting only "/" in the
> > > > "location" . Now it shows s3.tmp in the output.
> > > >
> > > > But, it has a weird problem.
> > > > The moment I add a folder to the location, eg: "/myfolder", then
> s3.tmp
> > > > vanishes from the "show schemas" output.
> > > >
> > > > Also, when I try to write into s3, I get the following error:
> > > >
> > > > Exception in thread "drill-executor-9" java.lang.
> UnsatisfiedLinkError:
> > > > org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(
> > > > Ljava/lang/String;I)Z
> > > >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(
> > Native
> > > > Method)+--+
> > > >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(
> > > > NativeIO.java:609)
> > > >
> > > > This is only a snippet of the error associated with writing to s3
> > > >
> > > > On Thu, May 25, 2017 at 12:41 PM, Shuporno Choudhury <
> > > > shuporno.choudhury@manthan.com> wrote:
> > > >
> > > >> My s3 plugin info is as follows:
> > > >>
> > > >> {
> > > >>   "type": "file",
> > > >>   "enabled": true,
> > > >>   "connection": "s3a://abcd",
> > > >>   "config": {
> > > >>     "fs.s3a.access.key": "abcd",
> > > >>     "fs.s3a.secret.key": "abcd"
> > > >>   },
> > > >>   "workspaces": {
> > > >>     "root": {
> > > >>       "location": "/",
> > > >>       "writable": false,
> > > >>       "defaultInputFormat": null
> > > >>     },
> > > >>     "tmp": {
> > > >>       "location": "/",
> > > >>       "writable": *true*,
> > > >>       "defaultInputFormat": "parquet"
> > > >>     }
> > > >>   }
> > > >>
> > > >>
> > > >> I have removed the info about the formats to keep the mail small.
> > > >> Also, I am using Dill on *Windows 10*
> > > >>
> > > >> On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury <
> > > >> shuporno.choudhury@manthan.com> wrote:
> > > >>
> > > >>> Hi,
> > > >>>
> > > >>> Is it possible to write to a folder in an s3 bucket using the
> > *s3.tmp*
> > > >>> workspace?
> > > >>> Whenever I try, it gives me the follwing error:
> > > >>>
> > > >>> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect
> > to
> > > >>> either root schema or current default schema.*
> > > >>> *Current default schema:  s3.root*
> > > >>>
> > > >>> Also, s3.tmp doesn't appear while using the command "*show
> schemas*"
> > > >>> though the tmp workspace exists in the web console
> > > >>>
> > > >>> I am using Drill Version 1.10; embedded mode on my local system.
> > > >>>
> > > >>> However, I have no problem reading from an s3 bucket, the problem
> is
> > > >>> only writing to a s3 bucket.
> > > >>> --
> > > >>> Regards,
> > > >>> Shuporno Choudhury
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> Regards,
> > > >> Shuporno Choudhury
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > > Shuporno Choudhury
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Shuporno Choudhury
> > >
> >
> >
> >
> > --
> > Nitin Pawar
> >
>
>
>
> --
> Regards,
> Shuporno Choudhury
>



-- 
Nitin Pawar

Re: Writing to s3 using Drill

Posted by Shuporno Choudhury <sh...@manthan.com>.
Hi Nitin,

Thanks for the config settings.

Now, after entering those config settings:
    1. s3.tmp does appear in the "show schemas" result
    2. It no longer disappears when I add a custom folder in the
location attribute

But when I try to run a CTAS statement, I get the following error:

*Error: SYSTEM ERROR: IllegalArgumentException: URI has an authority
component*
*Fragment 0:0*

Query that I am trying to run:
*create table s3.tmp.`abcd` as select 1 from (values(1));*

However, this query runs fine when I use dfs.tmp instead of s3.tmp.
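
For comparison, the equivalent statement that succeeds (same query, only the
schema changed):

*create table dfs.tmp.`abcd` as select 1 from (values(1));*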

On Fri, May 26, 2017 at 12:44 PM, Nitin Pawar <ni...@gmail.com>
wrote:

> Can you try with following s3 config
>
> {
>   "type": "file",
>   "enabled": true,
>   "connection": "s3a://bucket_name",
>   "config": {
>
>     "fs.s3a.connection.maximum": "10000",
>     "fs.s3a.access.key": "access_key",
>     "fs.s3a.secret.key": "secret_key",
>     "fs.s3a.buffer.dir": "/tmp",
>     "fs.s3a.multipart.size": "10485760",
>     "fs.s3a.multipart.threshold": "104857600"
>   },
>   "workspaces": {
>     "root": {
>       "location": "/",
>       "writable": false,
>       "defaultInputFormat": null
>     },
>     "tmp": {
>       "location": "/tmp",
>       "writable": true,
>       "defaultInputFormat": null
>     }
>   },
>   "formats": {
>     "psv": {
>       "type": "text",
>       "extensions": [
>         "tbl"
>       ],
>       "delimiter": "|"
>     },
>     "csv": {
>       "type": "text",
>       "extensions": [
>         "csv"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     },
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "delimiter": "\t"
>     },
>     "parquet": {
>       "type": "parquet"
>     },
>     "json": {
>       "type": "json",
>       "extensions": [
>         "json"
>       ]
>     },
>     "avro": {
>       "type": "avro"
>     },
>     "sequencefile": {
>       "type": "sequencefile",
>       "extensions": [
>         "seq"
>       ]
>     },
>     "csvh": {
>       "type": "text",
>       "extensions": [
>         "csvh"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     }
>   }
> }
>
> On Fri, May 26, 2017 at 10:29 AM, Shuporno Choudhury <
> shuporno.choudhury@manthan.com> wrote:
>
> > Hi,
> > Can someone at Drill help me with issue please?
> >
> > On Thu, May 25, 2017 at 1:33 PM, Shuporno Choudhury <
> > shuporno.choudhury@manthan.com> wrote:
> >
> > > HI,
> > >
> > > I corrected the "show schemas"  output by putting only "/" in the
> > > "location" . Now it shows s3.tmp in the output.
> > >
> > > But, it has a weird problem.
> > > The moment I add a folder to the location, eg: "/myfolder", then s3.tmp
> > > vanishes from the "show schemas" output.
> > >
> > > Also, when I try to write into s3, I get the following error:
> > >
> > > Exception in thread "drill-executor-9" java.lang.UnsatisfiedLinkError:
> > > org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(
> > > Ljava/lang/String;I)Z
> > >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(
> Native
> > > Method)+--+
> > >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(
> > > NativeIO.java:609)
> > >
> > > This is only a snippet of the error associated with writing to s3
> > >
> > > On Thu, May 25, 2017 at 12:41 PM, Shuporno Choudhury <
> > > shuporno.choudhury@manthan.com> wrote:
> > >
> > >> My s3 plugin info is as follows:
> > >>
> > >> {
> > >>   "type": "file",
> > >>   "enabled": true,
> > >>   "connection": "s3a://abcd",
> > >>   "config": {
> > >>     "fs.s3a.access.key": "abcd",
> > >>     "fs.s3a.secret.key": "abcd"
> > >>   },
> > >>   "workspaces": {
> > >>     "root": {
> > >>       "location": "/",
> > >>       "writable": false,
> > >>       "defaultInputFormat": null
> > >>     },
> > >>     "tmp": {
> > >>       "location": "/",
> > >>       "writable": *true*,
> > >>       "defaultInputFormat": "parquet"
> > >>     }
> > >>   }
> > >>
> > >>
> > >> I have removed the info about the formats to keep the mail small.
> > >> Also, I am using Dill on *Windows 10*
> > >>
> > >> On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury <
> > >> shuporno.choudhury@manthan.com> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> Is it possible to write to a folder in an s3 bucket using the
> *s3.tmp*
> > >>> workspace?
> > >>> Whenever I try, it gives me the follwing error:
> > >>>
> > >>> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect
> to
> > >>> either root schema or current default schema.*
> > >>> *Current default schema:  s3.root*
> > >>>
> > >>> Also, s3.tmp doesn't appear while using the command "*show schemas*"
> > >>> though the tmp workspace exists in the web console
> > >>>
> > >>> I am using Drill Version 1.10; embedded mode on my local system.
> > >>>
> > >>> However, I have no problem reading from an s3 bucket, the problem is
> > >>> only writing to a s3 bucket.
> > >>> --
> > >>> Regards,
> > >>> Shuporno Choudhury
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> Regards,
> > >> Shuporno Choudhury
> > >>
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Shuporno Choudhury
> > >
> >
> >
> >
> > --
> > Regards,
> > Shuporno Choudhury
> >
>
>
>
> --
> Nitin Pawar
>



-- 
Regards,
Shuporno Choudhury

Re: Writing to s3 using Drill

Posted by Nitin Pawar <ni...@gmail.com>.
Can you try with the following s3 config?

{
  "type": "file",
  "enabled": true,
  "connection": "s3a://bucket_name",
  "config": {

    "fs.s3a.connection.maximum": "10000",
    "fs.s3a.access.key": "access_key",
    "fs.s3a.secret.key": "secret_key",
    "fs.s3a.buffer.dir": "/tmp",
    "fs.s3a.multipart.size": "10485760",
    "fs.s3a.multipart.threshold": "104857600"
  },
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
      ],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "extractHeader": true,
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },
    "avro": {
      "type": "avro"
    },
    "sequencefile": {
      "type": "sequencefile",
      "extensions": [
        "seq"
      ]
    },
    "csvh": {
      "type": "text",
      "extensions": [
        "csvh"
      ],
      "extractHeader": true,
      "delimiter": ","
    }
  }
}
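
Once the plugin is saved, you can sanity-check the workspace from sqlline with
something like:

show schemas;
use s3.tmp;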

On Fri, May 26, 2017 at 10:29 AM, Shuporno Choudhury <
shuporno.choudhury@manthan.com> wrote:

> Hi,
> Can someone at Drill help me with issue please?
>
> On Thu, May 25, 2017 at 1:33 PM, Shuporno Choudhury <
> shuporno.choudhury@manthan.com> wrote:
>
> > HI,
> >
> > I corrected the "show schemas"  output by putting only "/" in the
> > "location" . Now it shows s3.tmp in the output.
> >
> > But, it has a weird problem.
> > The moment I add a folder to the location, eg: "/myfolder", then s3.tmp
> > vanishes from the "show schemas" output.
> >
> > Also, when I try to write into s3, I get the following error:
> >
> > Exception in thread "drill-executor-9" java.lang.UnsatisfiedLinkError:
> > org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(
> > Ljava/lang/String;I)Z
> >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native
> > Method)+--+
> >         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(
> > NativeIO.java:609)
> >
> > This is only a snippet of the error associated with writing to s3
> >
> > On Thu, May 25, 2017 at 12:41 PM, Shuporno Choudhury <
> > shuporno.choudhury@manthan.com> wrote:
> >
> >> My s3 plugin info is as follows:
> >>
> >> {
> >>   "type": "file",
> >>   "enabled": true,
> >>   "connection": "s3a://abcd",
> >>   "config": {
> >>     "fs.s3a.access.key": "abcd",
> >>     "fs.s3a.secret.key": "abcd"
> >>   },
> >>   "workspaces": {
> >>     "root": {
> >>       "location": "/",
> >>       "writable": false,
> >>       "defaultInputFormat": null
> >>     },
> >>     "tmp": {
> >>       "location": "/",
> >>       "writable": *true*,
> >>       "defaultInputFormat": "parquet"
> >>     }
> >>   }
> >>
> >>
> >> I have removed the info about the formats to keep the mail small.
> >> Also, I am using Dill on *Windows 10*
> >>
> >> On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury <
> >> shuporno.choudhury@manthan.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
> >>> workspace?
> >>> Whenever I try, it gives me the follwing error:
> >>>
> >>> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
> >>> either root schema or current default schema.*
> >>> *Current default schema:  s3.root*
> >>>
> >>> Also, s3.tmp doesn't appear while using the command "*show schemas*"
> >>> though the tmp workspace exists in the web console
> >>>
> >>> I am using Drill Version 1.10; embedded mode on my local system.
> >>>
> >>> However, I have no problem reading from an s3 bucket, the problem is
> >>> only writing to a s3 bucket.
> >>> --
> >>> Regards,
> >>> Shuporno Choudhury
> >>>
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Shuporno Choudhury
> >>
> >
> >
> >
> > --
> > Regards,
> > Shuporno Choudhury
> >
>
>
>
> --
> Regards,
> Shuporno Choudhury
>



-- 
Nitin Pawar

Re: Writing to s3 using Drill

Posted by Shuporno Choudhury <sh...@manthan.com>.
Hi,
Can someone at Drill help me with this issue, please?

On Thu, May 25, 2017 at 1:33 PM, Shuporno Choudhury <
shuporno.choudhury@manthan.com> wrote:

> HI,
>
> I corrected the "show schemas"  output by putting only "/" in the
> "location" . Now it shows s3.tmp in the output.
>
> But, it has a weird problem.
> The moment I add a folder to the location, eg: "/myfolder", then s3.tmp
> vanishes from the "show schemas" output.
>
> Also, when I try to write into s3, I get the following error:
>
> Exception in thread "drill-executor-9" java.lang.UnsatisfiedLinkError:
> org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(
> Ljava/lang/String;I)Z
>         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native
> Method)+--+
>         at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(
> NativeIO.java:609)
>
> This is only a snippet of the error associated with writing to s3
>
> On Thu, May 25, 2017 at 12:41 PM, Shuporno Choudhury <
> shuporno.choudhury@manthan.com> wrote:
>
>> My s3 plugin info is as follows:
>>
>> {
>>   "type": "file",
>>   "enabled": true,
>>   "connection": "s3a://abcd",
>>   "config": {
>>     "fs.s3a.access.key": "abcd",
>>     "fs.s3a.secret.key": "abcd"
>>   },
>>   "workspaces": {
>>     "root": {
>>       "location": "/",
>>       "writable": false,
>>       "defaultInputFormat": null
>>     },
>>     "tmp": {
>>       "location": "/",
>>       "writable": *true*,
>>       "defaultInputFormat": "parquet"
>>     }
>>   }
>>
>>
>> I have removed the info about the formats to keep the mail small.
>> Also, I am using Dill on *Windows 10*
>>
>> On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury <
>> shuporno.choudhury@manthan.com> wrote:
>>
>>> Hi,
>>>
>>> Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
>>> workspace?
>>> Whenever I try, it gives me the follwing error:
>>>
>>> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
>>> either root schema or current default schema.*
>>> *Current default schema:  s3.root*
>>>
>>> Also, s3.tmp doesn't appear while using the command "*show schemas*"
>>> though the tmp workspace exists in the web console
>>>
>>> I am using Drill Version 1.10; embedded mode on my local system.
>>>
>>> However, I have no problem reading from an s3 bucket, the problem is
>>> only writing to a s3 bucket.
>>> --
>>> Regards,
>>> Shuporno Choudhury
>>>
>>
>>
>>
>> --
>> Regards,
>> Shuporno Choudhury
>>
>
>
>
> --
> Regards,
> Shuporno Choudhury
>



-- 
Regards,
Shuporno Choudhury

Re: Writing to s3 using Drill

Posted by Shuporno Choudhury <sh...@manthan.com>.
Hi,

I corrected the "show schemas"  output by putting only "/" in the
"location" . Now it shows s3.tmp in the output.

But it has a weird problem:
the moment I add a folder to the location, e.g. "/myfolder", s3.tmp
vanishes from the "show schemas" output.

Also, when I try to write into s3, I get the following error:

Exception in thread "drill-executor-9" java.lang.UnsatisfiedLinkError:
org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
        at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
        at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:609)

This is only a snippet of the error associated with writing to s3

On Thu, May 25, 2017 at 12:41 PM, Shuporno Choudhury <
shuporno.choudhury@manthan.com> wrote:

> My s3 plugin info is as follows:
>
> {
>   "type": "file",
>   "enabled": true,
>   "connection": "s3a://abcd",
>   "config": {
>     "fs.s3a.access.key": "abcd",
>     "fs.s3a.secret.key": "abcd"
>   },
>   "workspaces": {
>     "root": {
>       "location": "/",
>       "writable": false,
>       "defaultInputFormat": null
>     },
>     "tmp": {
>       "location": "/",
>       "writable": *true*,
>       "defaultInputFormat": "parquet"
>     }
>   }
>
>
> I have removed the info about the formats to keep the mail small.
> Also, I am using Dill on *Windows 10*
>
> On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury <
> shuporno.choudhury@manthan.com> wrote:
>
>> Hi,
>>
>> Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
>> workspace?
>> Whenever I try, it gives me the follwing error:
>>
>> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
>> either root schema or current default schema.*
>> *Current default schema:  s3.root*
>>
>> Also, s3.tmp doesn't appear while using the command "*show schemas*"
>> though the tmp workspace exists in the web console
>>
>> I am using Drill Version 1.10; embedded mode on my local system.
>>
>> However, I have no problem reading from an s3 bucket, the problem is only
>> writing to a s3 bucket.
>> --
>> Regards,
>> Shuporno Choudhury
>>
>
>
>
> --
> Regards,
> Shuporno Choudhury
>



-- 
Regards,
Shuporno Choudhury

Re: Writing to s3 using Drill

Posted by Shuporno Choudhury <sh...@manthan.com>.
My s3 plugin info is as follows:

{
  "type": "file",
  "enabled": true,
  "connection": "s3a://abcd",
  "config": {
    "fs.s3a.access.key": "abcd",
    "fs.s3a.secret.key": "abcd"
  },
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/",
      "writable": *true*,
      "defaultInputFormat": "parquet"
    }
  }
}


I have removed the info about the formats to keep the mail small.
Also, I am using Drill on *Windows 10*.

On Mon, May 22, 2017 at 3:57 PM, Shuporno Choudhury <
shuporno.choudhury@manthan.com> wrote:

> Hi,
>
> Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
> workspace?
> Whenever I try, it gives me the follwing error:
>
> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
> either root schema or current default schema.*
> *Current default schema:  s3.root*
>
> Also, s3.tmp doesn't appear while using the command "*show schemas*"
> though the tmp workspace exists in the web console
>
> I am using Drill Version 1.10; embedded mode on my local system.
>
> However, I have no problem reading from an s3 bucket, the problem is only
> writing to a s3 bucket.
> --
> Regards,
> Shuporno Choudhury
>



-- 
Regards,
Shuporno Choudhury

Re: Writing to s3 using Drill

Posted by Abhishek Girish <ag...@apache.org>.
Sorry, I was wrong - please ignore my previous message. It looks like we do
support writing to S3, but a couple of small changes were necessary to make
this work:

First, I had to prefix the CTAS table name with the S3 plugin name. And
second, I had to either update the s3 storage plugin configuration to
include the default workspace and set writable to true, or create a
workspace with a path and set the writable option to true.

Example:

create table s3.abc.a_ctas as select * from s3.a

   "abc": {
      "location": "/a",
      "writable": true,
      "defaultInputFormat": null
    }

OR

create table s3.a_ctas as select * from s3.a

    "default": {
      "location": "/",
      "writable": true,
      "defaultInputFormat": null
    }



On Wed, May 24, 2017 at 12:22 PM, Abhishek Girish <ag...@apache.org>
wrote:

> I don't think we support writing to Object stores such as S3. We do
> support reading from S3 buckets via the S3a library. However, we have
> limited support with the plugin. You could file a enhancement request on
> JIRA [1].
>
> If someone has any experience with it, they can share details on the JIRA, or
> work on it. You are welcome to contribute yourself.
>
> [1] https://issues.apache.org/jira/browse/DRILL
>
> On Mon, May 22, 2017 at 3:27 AM, Shuporno Choudhury <
> shuporno.choudhury@manthan.com> wrote:
>
>> Hi,
>>
>> Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
>> workspace?
>> Whenever I try, it gives me the follwing error:
>>
>> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
>> either root schema or current default schema.*
>> *Current default schema:  s3.root*
>>
>> Also, s3.tmp doesn't appear while using the command "*show schemas*"
>> though
>> the tmp workspace exists in the web console
>>
>> I am using Drill Version 1.10; embedded mode on my local system.
>>
>> However, I have no problem reading from an s3 bucket, the problem is only
>> writing to a s3 bucket.
>> --
>> Regards,
>> Shuporno Choudhury
>>
>
>

Re: Writing to s3 using Drill

Posted by Abhishek Girish <ag...@apache.org>.
I don't think we support writing to object stores such as S3. We do support
reading from S3 buckets via the S3a library. However, we have limited
support in the plugin. You could file an enhancement request on JIRA [1].

If someone has any experience with it, they can share details on the JIRA, or
work on it. You are welcome to contribute yourself.

[1] https://issues.apache.org/jira/browse/DRILL

On Mon, May 22, 2017 at 3:27 AM, Shuporno Choudhury <
shuporno.choudhury@manthan.com> wrote:

> Hi,
>
> Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
> workspace?
> Whenever I try, it gives me the follwing error:
>
> *Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
> either root schema or current default schema.*
> *Current default schema:  s3.root*
>
> Also, s3.tmp doesn't appear while using the command "*show schemas*" though
> the tmp workspace exists in the web console
>
> I am using Drill Version 1.10; embedded mode on my local system.
>
> However, I have no problem reading from an s3 bucket, the problem is only
> writing to a s3 bucket.
> --
> Regards,
> Shuporno Choudhury
>

Re: Writing to s3 using Drill

Posted by Sorabh Hamirwasia <sh...@mapr.com>.
Hi Shuporno,

Can you please share your S3 plugin configuration? It looks like your configuration might be missing something like the below:


"tmp": {
      "location": "drill-tmp",
      "writable": true,
      "defaultInputFormat": null

 }
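
With a workspace like that in place, a write would then be addressed through it,
for example (the table name is a placeholder):

create table s3.tmp.`my_table` as select * from s3.root.`some_file.csv`;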


Thanks,
Sorabh


________________________________
From: Shuporno Choudhury <sh...@manthan.com>
Sent: Monday, May 22, 2017 3:27 AM
To: user@drill.apache.org
Subject: Writing to s3 using Drill

Hi,

Is it possible to write to a folder in an s3 bucket using the *s3.tmp*
workspace?
Whenever I try, it gives me the follwing error:

*Error: VALIDATION ERROR: Schema [s3.tmp] is not valid with respect to
either root schema or current default schema.*
*Current default schema:  s3.root*

Also, s3.tmp doesn't appear while using the command "*show schemas*" though
the tmp workspace exists in the web console

I am using Drill Version 1.10; embedded mode on my local system.

However, I have no problem reading from an s3 bucket, the problem is only
writing to a s3 bucket.
--
Regards,
Shuporno Choudhury