Posted to user@hadoop.apache.org by Phillip Henry <lo...@gmail.com> on 2020/02/27 11:50:53 UTC

hadoop-azure: "StorageException: The specified Rest Version is Unsupported"

I've built Spark 3.0.0-preview2 with the -Phadoop-3.2 profile switch and
deployed it via Kubernetes.

I launch Spark with a switch to pull in the relevant Hadoop/Azure
dependencies:

 --packages
org.apache.hadoop:hadoop-azure:3.2.0,org.apache.hadoop:hadoop-azure-datalake:3.2.0

and see that com.microsoft.azure#azure-storage;7.0.0 is indeed pulled in.
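
For context, the submit command looks roughly like this (a sketch only; the
Kubernetes master URL, container image, class and jar names are placeholders,
not my real values):

spark-submit \
  --master k8s://https://<api-server-host>:443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<my-spark-image> \
  --packages org.apache.hadoop:hadoop-azure:3.2.0,org.apache.hadoop:hadoop-azure-datalake:3.2.0 \
  --class <MainClass> \
  local:///opt/spark/jars/<my-app>.jar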

I can see files using a blob.core.windows.net URL, but a dfs.core.windows.net
URL throws an exception saying "The specified Rest Version is Unsupported".
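
For reference, these are the two URL forms I am trying (container, account
and path names are illustrative, not my real ones):

// works: blob endpoint
val fromBlob = spark.read.parquet("wasbs://mycontainer@myaccount.blob.core.windows.net/path/to/data")

// throws the StorageException: same data via the dfs endpoint (ABFS)
val fromDfs  = spark.read.parquet("abfss://mycontainer@myaccount.dfs.core.windows.net/path/to/data")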

I use tcpdump and see that my client is indeed using:

x-ms-version: 2017-07-29

in its HTTP headers.

If I upgrade to azure-storage:8.6.0, I see in the HTTP headers:

x-ms-version: 2019-02-02

and the job gets slightly further but reading the Parquet file now fails
with "Incorrect Blob type, please use the correct Blob type to access a
blob on the server. Expected BLOCK_BLOB, actual UNSPECIFIED".

This is not overly surprising, as I am shoe-horning in a binary that Hadoop
was not built against. I only did this to demonstrate that this version of the
library can talk to Azure, since its REST version is more recent.

Does anybody have any ideas on how I can talk to Azure?

[Note: for various non-technical reasons, I cannot use HDInsight or
Databricks.]

Kind regards,

Phillip

Re: hadoop-azure: "StorageException: The specified Rest Version is Unsupported"

Posted by Phillip Henry <lo...@gmail.com>.
Solved my own problem.

Here's what I did, for future reference: use OAuth, like so:

spark.conf.set("fs.azure.account.auth.type",
"OAuth")
spark.conf.set("fs.azure.account.oauth2.client.secret",
 SECRET)
spark.conf.set("fs.azure.account.oauth2.client.id" ,                  APP_ID
)
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization",
"true")
spark.conf.set("fs.azure.account.oauth.provider.type",
"org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.endpoint", "
https://login.microsoftonline.com/" + TENANT + "/oauth2/token")
spark.conf.set("fs.azure.account.auth.type." + accountName + ".
dfs.core.windows.net", "SharedKey")
spark.conf.set("fs.azure.account.key."       + accountName + ".
dfs.core.windows.net", accountKey)

where:
- TENANT is the ID of our Azure Active Directory tenant
- APP_ID (also known as the client ID) is the ID of the service principal
- SECRET is something you create under the service principal and use to
  authenticate (i.e. a password)
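
With that configuration in place, a read against the dfs endpoint looks
something like this (the container name and path are illustrative):

val container = "mycontainer"
val df = spark.read.parquet("abfss://" + container + "@" + accountName + ".dfs.core.windows.net/path/to/data.parquet")
df.show()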

Phillip


