Posted to issues@spark.apache.org by "Saulo Ricci (JIRA)" <ji...@apache.org> on 2017/01/08 01:30:58 UTC

[jira] [Created] (SPARK-19123) KeyProviderException when reading Azure Blobs from Apache Spark

Saulo Ricci created SPARK-19123:
-----------------------------------

             Summary: KeyProviderException when reading Azure Blobs from Apache Spark
                 Key: SPARK-19123
                 URL: https://issues.apache.org/jira/browse/SPARK-19123
             Project: Spark
          Issue Type: Question
          Components: Input/Output, Java API
    Affects Versions: 2.0.0
         Environment: Apache Spark 2.0.0 running on Azure HDInsight cluster version 3.5 with Hadoop version 2.7.3
            Reporter: Saulo Ricci
            Priority: Critical


I created a Spark job that is intended to read a set of JSON files from an Azure Blob container. I set the key and reference to my storage account and I'm reading the files as shown in the snippet below:

{code:java}
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    SparkSession sparkSession = SparkSession.builder()
        .appName("Pipeline")
        .master("yarn")
        // Use the WASB (native Azure) file system and register the storage account key.
        .config("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
        .config("fs.azure.account.key.<storage_name>.blob.core.windows.net", "<storage_key>")
        .getOrCreate();

    Dataset<Row> txs = sparkSession.read().json("wasb://path_to_files");
{code}
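For reference, the same account key can also be registered on the underlying Hadoop configuration after the session has been created. This is only a minimal sketch assuming the standard hadoop-azure property name, with the same `<storage_name>`/`<storage_key>` placeholders as above:

{code:java}
    // Sketch: set the WASB account key on the Hadoop configuration used by the
    // file system layer (placeholders kept as in the snippet above).
    sparkSession.sparkContext().hadoopConfiguration()
        .set("fs.azure.account.key.<storage_name>.blob.core.windows.net", "<storage_key>");

    Dataset<Row> txs = sparkSession.read().json("wasb://path_to_files");
{code}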
Unfortunately, I'm getting an `org.apache.hadoop.fs.azure.KeyProviderException` when reading the blobs from Azure storage. According to the trace shown below it seems the header is too long, but I'm still trying to figure out what exactly that means:

{code:java}
    17/01/07 19:28:39 ERROR ApplicationMaster: User class threw exception: org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
    140473279682200:error:0D07207B:asn1 encoding routines:ASN1_get_object:header too long:asn1_lib.c:157:
    140473279682200:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:192:
    140473279682200:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:

    org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
    140473279682200:error:0D07207B:asn1 encoding routines:ASN1_get_object:header too long:asn1_lib.c:157:
    140473279682200:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:192:
    140473279682200:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:

	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:953)
	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:450)
	at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1209)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2795)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2777)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:366)
	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:364)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
	at scala.collection.immutable.List.flatMap(List.scala:344)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:364)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:294)
	at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:249)
	at taka.pipelines.AnomalyTrainingPipeline.main(AnomalyTrainingPipeline.java:35)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:627)
    Caused by: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
{code}
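Reading the trace, the S/MIME and ASN.1 errors appear to come from the key provider trying to decrypt the account key with an external command rather than using the configured value directly. As a hedged sketch only (assuming the standard hadoop-azure `fs.azure.account.keyprovider.*` property and the plain-text `org.apache.hadoop.fs.azure.SimpleKeyProvider` class), the key provider for the account could be pinned explicitly:

{code:java}
    // Sketch, not verified on this cluster: force the plain-text key provider for
    // <storage_name> so the configured key is used as-is instead of being handed
    // to a decryption command (assumes standard hadoop-azure property names).
    sparkSession.sparkContext().hadoopConfiguration().set(
        "fs.azure.account.keyprovider.<storage_name>.blob.core.windows.net",
        "org.apache.hadoop.fs.azure.SimpleKeyProvider");
{code}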
I'm using an Apache Spark 2.0.0 cluster running on top of an Azure HDInsight cluster and I'd like to find a solution to this problem. I'd appreciate any suggestions.


