Posted to issues@spark.apache.org by "Shuai Lin (JIRA)" <ji...@apache.org> on 2017/01/08 04:10:58 UTC

[jira] [Commented] (SPARK-19123) KeyProviderException when reading Azure Blobs from Apache Spark

    [ https://issues.apache.org/jira/browse/SPARK-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15808663#comment-15808663 ] 

Shuai Lin commented on SPARK-19123:
-----------------------------------

IIUC {{KeyProviderException}} means the storage account key is not configured properly. Are you sure the way you're specifying the key is correct? Have you checked the Azure developer docs for it?
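One thing worth double-checking: the {{fs.azure.account.key.*}} setting has to end up in the *Hadoop* configuration, not only the Spark one. A common way to do that from Spark is the {{spark.hadoop.}} prefix, e.g. (just a sketch; {{<storage_name>}}, {{<storage_key>}}, and the jar/class names are placeholders for your own values):

{code}
# Pass the account key through to the Hadoop configuration via spark-submit.
# Everything after "spark.hadoop." is copied into the Hadoop conf as-is.
spark-submit \
  --conf spark.hadoop.fs.azure.account.key.<storage_name>.blob.core.windows.net=<storage_key> \
  --class your.MainClass your-app.jar
{code}

Alternatively you can set it programmatically on the Hadoop configuration, e.g. {{sparkSession.sparkContext().hadoopConfiguration().set(...)}}, or put it in {{core-site.xml}} on the cluster (HDInsight normally pre-populates it there for the cluster's default storage account).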

BTW I don't think this is a "critical issue", so I changed it to "minor".

> KeyProviderException when reading Azure Blobs from Apache Spark
> ---------------------------------------------------------------
>
>                 Key: SPARK-19123
>                 URL: https://issues.apache.org/jira/browse/SPARK-19123
>             Project: Spark
>          Issue Type: Question
>          Components: Input/Output, Java API
>    Affects Versions: 2.0.0
>         Environment: Apache Spark 2.0.0 running on Azure HDInsight cluster version 3.5 with Hadoop version 2.7.3
>            Reporter: Saulo Ricci
>            Priority: Minor
>              Labels: newbie
>
> I created a Spark job that is intended to read a set of JSON files from an Azure Blob container. I set the key and reference to my storage account, and I'm reading the files as shown in the snippet below:
> {code:java}
>     SparkSession
>         sparkSession =
>         SparkSession.builder().appName("Pipeline")
>             .master("yarn")
>             .config("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
>             .config("fs.azure.account.key.<storage_name>.blob.core.windows.net","<storage_key>")
>             .getOrCreate();
>     Dataset<Row> txs = sparkSession.read().json("wasb://path_to_files");
> {code}
> The point is that I'm unfortunately getting an {{org.apache.hadoop.fs.azure.KeyProviderException}} when reading the blobs from the Azure storage. According to the trace shown below it seems the header is too long, but I'm still trying to figure out what exactly that means:
> {code:java}
>     17/01/07 19:28:39 ERROR ApplicationMaster: User class threw exception: org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
>     140473279682200:error:0D07207B:asn1 encoding routines:ASN1_get_object:header too long:asn1_lib.c:157:
>     140473279682200:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:192:
>     140473279682200:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:
>     org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
>     140473279682200:error:0D07207B:asn1 encoding routines:ASN1_get_object:header too long:asn1_lib.c:157:
>     140473279682200:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:192:
>     140473279682200:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:
> 	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:953)
> 	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:450)
> 	at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1209)
> 	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
> 	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
> 	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2795)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2777)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
> 	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> 	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:366)
> 	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:364)
> 	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
> 	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
> 	at scala.collection.immutable.List.foreach(List.scala:381)
> 	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
> 	at scala.collection.immutable.List.flatMap(List.scala:344)
> 	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:364)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
> 	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:294)
> 	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:249)
> 	at taka.pipelines.AnomalyTrainingPipeline.main(AnomalyTrainingPipeline.java:35)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:627)
>     Caused by: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
> {code}
> I'm using an Apache Spark 2.0.0 cluster set up on top of an Azure HDInsight cluster, and I'd like to find a solution to this problem. I appreciate any suggestions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org