Posted to dev@drill.apache.org by "Barber, Sacha" <Sa...@tpicap.com> on 2022/11/17 13:14:44 UTC

access point query

I have successfully set up an s3 access point which works for me using


  *   AWS CLI
  *   AWS C# SDK

I would like to use Apache Drill and I have seen it supports access points here : https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#accesspoints

So I currently have this in the Drill s3 storage setup, which as far as I can tell is correct



Where the following is true


  *   "Connection" points to the actual S3 location
  *   fs.s3a.binli-qlik-test-bucket.accesspoint.arn : is my specific access point ARN for that bucket, which seems to be what the docs show here


In my core-site.xml I have the following lines commented out

<!--
    <property>
        <name>fs.s3a.access.key</name>
        <value>REDACTED</value>
    </property>

    <property>
        <name>fs.s3a.secret.key</name>
        <value>REDACTED</value>
    </property>

    <property>
        <name>fs.s3a.endpoint</name>
        <value>s3.eu-west-1.amazonaws.com</value>
    </property>
-->


So when I run this query in the web UI

select * from s3.`subfolder` limit 10

I get this error:




As I say, this works in two other demos outside of Apache Drill. What am I doing wrong with access points? There are not many web pages explaining this, and I am a bit lost.




Kind regards,
Sacha Barber
Technical Lead

TP ICAP




Classification: Public
This communication is sent by TP ICAP plc on behalf of the ICAP Energy Limited Tullett Prebon (Europe) Limited and Spectron Energy Services Limited (together the “Tankard Parties”). All information contained in or attached to it (including, but not limited to market prices/levels and market commentary) (the "Information") is for information purposes only, is confidential, may be legally privileged and is the intellectual property of the Tankard Parties. The Information is subject to any terms of business as published or communicated to clients from time to time and is directed to Eligible Counterparties and Professional Customers only and is not intended for Retail Clients (as each term is defined by the rules of the Financial Conduct Authority). The Information is not, and should not be construed as, an offer, bid, recommendation or solicitation in relation to any financial instrument or investment or to participate in any particular trading strategy. The Information is not to be relied upon and is not warranted, including, but not limited, as to completeness, timeliness or accuracy and is subject to change without notice.

All representations and warranties are expressly disclaimed. Access to the Information by anyone other than the intended recipient is unauthorised and any disclosure, copying or redistribution is prohibited. For further regulatory information please see www.icap.com. If you receive this message in error, please immediately delete all copies of it and notify the sender.

We have taken precautions to minimise the risk of transmitting software viruses, but we advise you to carry out your own virus checks on any attachment to this message. We cannot accept liability for any loss or damage caused by software viruses.

Tullett Prebon (Europe) Limited and ICAP Energy Limited are part of TP ICAP group. TP ICAP Group plc | Registered in Jersey | Company number 130617. For further information please visit www.tpicap.com.

RE: [External email] Re: access point query

Posted by "Barber, Sacha" <Sa...@tpicap.com>.
Ah ok, let me try again with text shown below instead (btw I am working on Windows).

I previously had Drill working fine using direct S3 bucket access. Anyway, see the revised version below.

Kind regards,
Sacha Barber
Technical Lead




Classification: Public
From: James Turton <dz...@apache.org>
Sent: 17 November 2022 13:35
To: dev@drill.apache.org; Barber, Sacha <Sa...@tpicap.com>
Cc: Singleton, Owen <os...@liquidnet.com>
Subject: [External email] Re: access point query





________________________________
Hi Sacha

This mailing list does not support images in emails so we can't see a lot of what you've sent. Where the images are screenshots of text can you include the text instead? If there are still cases where you want to send an image rather than text, please use a paste bin service and include hyperlinks in your email.

Thanks
James
On 2022/11/17 15:14, Barber, Sacha wrote:
I have successfully set up an s3 access point which works for me using


  1.  AWS CLI
  2.  AWS C# SDK

I would like to use Apache Drill and I have seen it supports access points here : https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#accesspoints

So I currently have this in the Drill s3 storage setup, which as far as I can tell is correct

{
  "type": "file",
  "connection": "s3a://binli-qlik-test-bucket/",
  "config": {
    "fs.s3a.secret.key": "REDACTED",
    "fs.s3a.access.key": " REDACTED ",
    "fs.s3a.binli-qlik-test-bucket.accesspoint.arn": "arn:aws:s3:eu-west-1:997132959312:accesspoint/binli-qlik-test-ap",
    "fs.s3a.accesspoint.required": "true"
  },
  "workspaces": {
    "tmp": {
      "location": "C:/Users/sbarber-a/drill/",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": true
    },
    "root": {
      "location": "C:/Users/sbarber-a/drill/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": true
    }
  },
  "formats": {
    "parquet": {
      "type": "parquet"
    },
    "avro": {
      "type": "avro",
      "extensions": [
        "avro"
      ]
    },
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },
    "csvh": {
      "type": "text",
      "extensions": [
        "csvh"
      ],
      "extractHeader": true
    },
    "sequencefile": {
      "type": "sequencefile",
      "extensions": [
        "seq"
      ]
    },
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
      ],
      "fieldDelimiter": "|"
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "fieldDelimiter": "\t"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ]
    },
    "shp": {
      "type": "shp",
      "extensions": [
        "shp"
      ]
    },
    "hdf5": {
      "type": "hdf5",
      "extensions": [
        "h5"
      ],
      "defaultPath": null
    },
    "spss": {
      "type": "spss",
      "extensions": [
        "sav"
      ]
    },
    "xml": {
      "type": "xml",
      "extensions": [
        "xml"
      ],
      "dataLevel": 1
    },
    "excel": {
      "type": "excel",
      "extensions": [
        "xlsx"
      ],
      "lastRow": 1048576
    },
    "pcap": {
      "type": "pcap",
      "extensions": [
        "pcap",
        "pcapng"
      ]
    },
    "ltsv": {
      "type": "ltsv",
      "extensions": [
        "ltsv"
      ]
    },
    "image": {
      "type": "image",
      "extensions": [
        "jpg",
        "jpeg",
        "jpe",
        "tif",
        "tiff",
        "dng",
        "psd",
        "png",
        "bmp",
        "gif",
        "ico",
        "pcx",
        "wav",
        "wave",
        "avi",
        "webp",
        "mov",
        "mp4",
        "m4a",
        "m4p",
        "m4b",
        "m4r",
        "m4v",
        "3gp",
        "3g2",
        "eps",
        "epsf",
        "epsi",
        "ai",
        "arw",
        "crw",
        "cr2",
        "nef",
        "orf",
        "raf",
        "rw2",
        "rwl",
        "srw",
        "x3f"
      ],
      "fileSystemMetadata": true,
      "descriptive": true
    },
    "syslog": {
      "type": "syslog",
      "extensions": [
        "syslog"
      ],
      "maxErrors": 10
    },
    "httpd": {
      "type": "httpd",
      "extensions": [
        "httpd"
      ],
      "logFormat": "common\ncombined"
    }
  },
  "enabled": true
}


Where the following is true


  1.  "Connection" points to the actual S3 location
  2.  fs.s3a.binli-qlik-test-bucket.accesspoint.arn : is my specific access point ARN for that bucket, which seems to be what the docs show here

https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html (see the section "Configuring S3 AccessPoints usage with S3A")
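For comparison, the per-bucket example in that Hadoop section appears to be written as fs.s3a.bucket.<bucket-name>.accesspoint.arn, with a "bucket." segment after "fs.s3a.". The short Python sketch below only builds that documented pattern and compares it with the key used in the plugin configuration above. The helper name per_bucket_key is made up for illustration, and whether Drill passes the key through to S3A unchanged is an assumption.

```python
# Sketch only: compares the key from the plugin config above with the
# per-bucket naming pattern shown in the Hadoop S3A documentation.
BUCKET = "binli-qlik-test-bucket"


def per_bucket_key(bucket: str, option: str) -> str:
    """Build a per-bucket S3A property name: fs.s3a.bucket.<bucket>.<option>.

    Hypothetical helper, not part of Drill or Hadoop.
    """
    return f"fs.s3a.bucket.{bucket}.{option}"


# Key exactly as it appears in the storage plugin "config" block above.
configured_key = "fs.s3a.binli-qlik-test-bucket.accesspoint.arn"

# Key following the documented per-bucket pattern.
documented_key = per_bucket_key(BUCKET, "accesspoint.arn")

print("configured:", configured_key)
print("documented:", documented_key)
print("match:", configured_key == documented_key)
```

If the two keys differ, the access point ARN may simply not be picked up for the bucket, so the mismatch is worth ruling out first.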




In my core-site.xml I have the following lines commented out

<!--
    <property>
        <name>fs.s3a.access.key</name>
        <value>REDACTED</value>
    </property>

    <property>
        <name>fs.s3a.secret.key</name>
        <value>REDACTED</value>
    </property>

    <property>
        <name>fs.s3a.endpoint</name>
        <value>s3.eu-west-1.amazonaws.com</value>
    </property>
-->


So when I run this query in the web UI

select * from s3.`subfolder` limit 10

I get this error:





As I say, this works in two other demos outside of Apache Drill. What am I doing wrong with access points? There are not many web pages explaining this, and I am a bit lost.

The access point I have set up points to a bucket that has a subfolder called "open", which has a further subfolder called "subfolder", so I have tried these:

select * from s3.`subfolder` limit 10
select * from s3.`open/`.`subfolder` limit 10
select * from s3.`open`.`subfolder` limit 10

All of which seem to result in this error:
org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: Failed to load schema for "s3"! java.nio.file.AccessDeniedException: s3a://binli-qlik-test-bucket/C:/Users/sbarber-a/drill: getFileStatus on s3a://binli-qlik-test-bucket/C:/Users/sbarber-a/drill: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: P9J1QKQ55PX19G45; S3 Extended Request ID: e5B9m8FYUpZ1Mp7B3p1DtZ2tlLr98ziYKYUgYJlfLVw7x74rwPGiPDrWRFtG3fUJ580p/GyT7ZY=), S3 Extended Request ID: e5B9m8FYUpZ1Mp7B3p1DtZ2tlLr98ziYKYUgYJlfLVw7x74rwPGiPDrWRFtG3fUJ580p/GyT7ZY=:403 Forbidden [Error Id: 7c4fdfbe-134f-4e4c-b5b1-2ff2131b662a ]
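The path in the error, s3a://binli-qlik-test-bucket/C:/Users/sbarber-a/drill, looks like a workspace "location" from the plugin configuration appended to the "connection" URI. The Python sketch below is not Drill's code. resolve_workspace is a hypothetical helper that only reproduces that string, on the assumption that workspace locations are interpreted relative to the bucket rather than the local disk.

```python
def resolve_workspace(connection: str, location: str) -> str:
    """Join a connection URI and a workspace location.

    Illustrative only: this mimics the path seen in the error message,
    it is not how Drill is known to implement the lookup.
    """
    return connection.rstrip("/") + "/" + location.strip("/")


connection = "s3a://binli-qlik-test-bucket/"

# Local Windows path, as in the "tmp" and "root" workspaces above.
local_style = resolve_workspace(connection, "C:/Users/sbarber-a/drill/")

# Bucket-relative location (an assumed alternative, not taken from the thread).
bucket_style = resolve_workspace(connection, "/")

print(local_style)
print(bucket_style)
```

Under that assumption, the 403 would come from S3 being asked for a key named C:/Users/sbarber-a/drill inside the bucket, which suggests trying bucket-relative workspace locations such as "/" before looking further at the access point settings.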


Kind regards,
Sacha Barber
Technical Lead

TP ICAP




Classification: Public

Honesty | Integrity | Respect | Excellence
________________________________

The information contained in this email is confidential and may also contain privileged information. Sender does not waive confidentiality or legal privilege. If you are not the intended recipient please notify the sender immediately; you should not retain this message or disclose its content to anyone.

Internet communications are not secure or error free and the sender does not accept any liability for the content of the email. Although emails are routinely screened for viruses, the sender does not accept responsibility for any damage caused. Replies to this email may be monitored.

TP ICAP Group plc | Registered in Jersey | Company number 130617. Please visit www.tpicap.com for further information on TP ICAP group.


Re: access point query

Posted by James Turton <dz...@apache.org>.
Hi Sacha

This mailing list does not support images in emails so we can't see a 
lot of what you've sent. Where the images are screenshots of text can 
you include the text instead? If there are still cases where you want to 
send an image rather than text, please use a paste bin service and 
include hyperlinks in your email.

Thanks
James
