Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2018/12/17 11:02:00 UTC

[jira] [Comment Edited] (HADOOP-16005) NativeAzureFileSystem does not support setXAttr

    [ https://issues.apache.org/jira/browse/HADOOP-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722876#comment-16722876 ] 

Steve Loughran edited comment on HADOOP-16005 at 12/17/18 11:01 AM:
--------------------------------------------------------------------

I should add: serving up the etag as the file checksum would be nice. It lets you do backups which use a change in the etag as the sign of a file being out of date.

Look at

* the class which describes the etag: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/store/EtagChecksum.java
* HADOOP-13282 is the change to S3A to add this; HADOOP-15287 is the discovery that we'd better make it optional, to stop distcp backups from HDFS failing, as too many jobs weren't using {{-skipCrc}} on the command line, it 
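
As a rough illustration of the backup idea above (not code from Hadoop): a backup tool can record the etag it saw when it last copied a file and treat any different etag as the sign that the copy is out of date. The names EtagBackupSketch, BackupRecord and isOutOfDate are hypothetical, and the etag is modelled as a plain string rather than an EtagChecksum so that the sketch runs without Hadoop on the classpath.

```scala
object EtagBackupSketch {
  // Hypothetical record of what the backup tool saw when it last copied a file.
  final case class BackupRecord(path: String, etag: String)

  // True when the file should be copied again.
  //   recorded:    the record from the previous backup run, if there is one
  //   currentEtag: the etag the store reports now, if it reports one
  def isOutOfDate(recorded: Option[BackupRecord], currentEtag: Option[String]): Boolean =
    (recorded, currentEtag) match {
      // Both sides known: the file changed only if the etag changed.
      case (Some(previous), Some(current)) => previous.etag != current
      // No earlier record, or no etag available: copy to be safe.
      case _ => true
    }
}
```

A store which exposes no etag is treated as out of date here, so the file is copied again rather than silently skipped.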


> NativeAzureFileSystem does not support setXAttr
> -----------------------------------------------
>
>                 Key: HADOOP-16005
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16005
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>            Reporter: Clemens Wolff
>            Priority: Major
>
> When interacting with Azure Blob Storage via the Hadoop FileSystem client, it's currently (as of [a8bbd81|https://github.com/apache/hadoop/commit/a8bbd818d5bc4762324bcdb7cf1fdd5c2f93891b]) not possible to set custom metadata attributes.
> Here is a snippet that demonstrates the missing behavior (throws an UnsupportedOperationException):
> {code:java}
> val blobAccount = "SET ME"
> val blobKey = "SET ME"
> val blobContainer = "SET ME"
> val blobFile = "SET ME"
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> val conf = new Configuration()
> conf.set("fs.wasbs.impl", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
> conf.set(s"fs.azure.account.key.$blobAccount.blob.core.windows.net", blobKey)
> val path = new Path(s"wasbs://$blobContainer@$blobAccount.blob.core.windows.net/$blobFile")
> val fs = FileSystem.get(path.toUri, conf)
> fs.setXAttr(path, "somekey", "somevalue".getBytes)
> {code}
> Looking at the code in hadoop-tools/hadoop-azure, NativeAzureFileSystem inherits the default setXAttr from FileSystem which throws the UnsupportedOperationException.
> The underlying Azure Blob Storage service does support custom metadata ([service docs|https://docs.microsoft.com/en-us/azure/storage/blobs/storage-properties-metadata]) as does the azure-storage SDK that's being used by NativeAzureFileSystem ([SDK docs|http://javadox.com/com.microsoft.azure/azure-storage/2.0.0/com/microsoft/azure/storage/blob/CloudBlob.html#setMetadata(java.util.HashMap)]).
> Is there another way that I should be setting custom metadata on Azure Blob Storage files? Is there a specific reason why setXAttr hasn't been implemented on NativeAzureFileSystem? If not, I can take a shot at implementing it.
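
One detail an implementation would have to settle: setXAttr takes the attribute value as a byte array, while the setMetadata call linked above takes a map of strings. Below is a minimal sketch of one possible mapping, assuming the value is Base64-encoded before it is stored. That encoding is an illustrative choice, not what NativeAzureFileSystem does, and XAttrMetadataSketch, withAttr and readAttr are hypothetical names. The blob's metadata is modelled as an immutable Map so that the sketch needs no Azure SDK.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

object XAttrMetadataSketch {
  // Return a copy of the metadata map with the attribute added,
  // its raw bytes encoded as a Base64 string.
  def withAttr(metadata: Map[String, String], name: String, value: Array[Byte]): Map[String, String] =
    metadata + (name -> Base64.getEncoder.encodeToString(value))

  // Decode the attribute back to its raw bytes; None when the attribute is absent.
  def readAttr(metadata: Map[String, String], name: String): Option[Array[Byte]] =
    metadata.get(name).map(encoded => Base64.getDecoder.decode(encoded))
}

object XAttrMetadataSketchDemo {
  def main(args: Array[String]): Unit = {
    val stored = XAttrMetadataSketch.withAttr(
      Map.empty, "somekey", "somevalue".getBytes(StandardCharsets.UTF_8))
    val roundTripped = XAttrMetadataSketch.readAttr(stored, "somekey")
      .map(bytes => new String(bytes, StandardCharsets.UTF_8))
    println(roundTripped) // the value written above, wrapped in Some
  }
}
```

Base64 keeps arbitrary bytes intact in a string-only store, at the cost of making the stored value unreadable to other clients that inspect the blob's metadata directly.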


