You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-commits@hadoop.apache.org by li...@apache.org on 2017/06/06 18:09:46 UTC

hadoop git commit: HADOOP-14491. Azure has messed doc structure. Contributed by Mingliang Liu

Repository: hadoop
Updated Branches:
  refs/heads/trunk 6b5285bbc -> 536f05715


HADOOP-14491. Azure has messed doc structure. Contributed by Mingliang Liu


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/536f0571
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/536f0571
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/536f0571

Branch: refs/heads/trunk
Commit: 536f057158c445a57049f6c392869ae2f0be4f24
Parents: 6b5285b
Author: Mingliang Liu <li...@apache.org>
Authored: Mon Jun 5 18:24:40 2017 -0700
Committer: Mingliang Liu <li...@apache.org>
Committed: Tue Jun 6 11:09:28 2017 -0700

----------------------------------------------------------------------
 .../hadoop-azure/src/site/markdown/index.md     | 233 ++++++++++---------
 1 file changed, 123 insertions(+), 110 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/536f0571/hadoop-tools/hadoop-azure/src/site/markdown/index.md
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-azure/src/site/markdown/index.md b/hadoop-tools/hadoop-azure/src/site/markdown/index.md
index 1dca3b9..9c57e60 100644
--- a/hadoop-tools/hadoop-azure/src/site/markdown/index.md
+++ b/hadoop-tools/hadoop-azure/src/site/markdown/index.md
@@ -14,20 +14,9 @@
 
 # Hadoop Azure Support: Azure Blob Storage
 
-* [Introduction](#Introduction)
-* [Features](#Features)
-* [Limitations](#Limitations)
-* [Usage](#Usage)
-    * [Concepts](#Concepts)
-    * [Configuring Credentials](#Configuring_Credentials)
-    * [Page Blob Support and Configuration](#Page_Blob_Support_and_Configuration)
-    * [Atomic Folder Rename](#Atomic_Folder_Rename)
-    * [Accessing wasb URLs](#Accessing_wasb_URLs)
-    * [Append API Support and Configuration](#Append_API_Support_and_Configuration)
-    * [Multithread Support](#Multithread_Support)
-* [Testing the hadoop-azure Module](#Testing_the_hadoop-azure_Module)
-
-## <a name="Introduction" />Introduction
+<!-- MACRO{toc|fromDepth=1|toDepth=3} -->
+
+## Introduction
 
 The hadoop-azure module provides support for integration with
 [Azure Blob Storage](http://azure.microsoft.com/en-us/documentation/services/storage/).
@@ -38,7 +27,7 @@ on the additional artifacts it requires, notably the
 To make it part of Apache Hadoop's default classpath, simply make sure that
 HADOOP_OPTIONAL_TOOLS in hadoop-env.sh has 'hadoop-azure' in the list.
 
-## <a name="Features" />Features
+## Features
 
 * Read and write data stored in an Azure Blob Storage account.
 * Present a hierarchical file system view by implementing the standard Hadoop
@@ -54,15 +43,15 @@ HADOOP_OPTIONAL_TOOLS in hadoop-env.sh has 'hadoop-azure' in the list.
 * Tested on both Linux and Windows.
 * Tested at scale.
 
-## <a name="Limitations" />Limitations
+## Limitations
 
 * File owner and group are persisted, but the permissions model is not enforced.
   Authorization occurs at the level of the entire Azure Blob Storage account.
 * File last access time is not tracked.
 
-## <a name="Usage" />Usage
+## Usage
 
-### <a name="Concepts" />Concepts
+### Concepts
 
 The Azure Blob Storage data model presents 3 core concepts:
 
@@ -76,7 +65,7 @@ The Azure Blob Storage data model presents 3 core concepts:
   The internal implementation also uses blobs to persist the file system
   hierarchy and other metadata.
 
-### <a name="Configuring_Credentials" />Configuring Credentials
+### Configuring Credentials
 
 Usage of Azure Blob Storage requires configuration of credentials.  Typically
 this is set in core-site.xml.  The configuration property name is of the form
@@ -87,11 +76,12 @@ untrusted party.**
 
 For example:
 
-    <property>
-      <name>fs.azure.account.key.youraccount.blob.core.windows.net</name>
-      <value>YOUR ACCESS KEY</value>
-    </property>
-
+```xml
+<property>
+  <name>fs.azure.account.key.youraccount.blob.core.windows.net</name>
+  <value>YOUR ACCESS KEY</value>
+</property>
+```
 In many Hadoop clusters, the core-site.xml file is world-readable. It is possible to
 protect the access key within a credential provider as well. This provides an encrypted
 file format along with protection with file permissions.
@@ -110,14 +100,14 @@ For additional reading on the credential provider API see:
 
 ###### provision
 
-```
+```bash
 % hadoop credential create fs.azure.account.key.youraccount.blob.core.windows.net -value 123
     -provider localjceks://file/home/lmccay/wasb.jceks
 ```
 
 ###### configure core-site.xml or command line system property
 
-```
+```xml
 <property>
   <name>hadoop.security.credential.provider.path</name>
   <value>localjceks://file/home/lmccay/wasb.jceks</value>
@@ -127,7 +117,7 @@ For additional reading on the credential provider API see:
 
 ###### distcp
 
-```
+```bash
 % hadoop distcp
     [-D hadoop.security.credential.provider.path=localjceks://file/home/lmccay/wasb.jceks]
     hdfs://hostname:9001/user/lmccay/007020615 wasb://yourcontainer@youraccount.blob.core.windows.net/testDir/
@@ -145,22 +135,25 @@ specifies an external program to be invoked by Hadoop processes to decrypt the
 key.  The encrypted key value is passed to this external program as a command
 line argument:
 
-    <property>
-      <name>fs.azure.account.keyprovider.youraccount</name>
-      <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value>
-    </property>
+```xml
+<property>
+  <name>fs.azure.account.keyprovider.youraccount</name>
+  <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value>
+</property>
 
-    <property>
-      <name>fs.azure.account.key.youraccount.blob.core.windows.net</name>
-      <value>YOUR ENCRYPTED ACCESS KEY</value>
-    </property>
+<property>
+  <name>fs.azure.account.key.youraccount.blob.core.windows.net</name>
+  <value>YOUR ENCRYPTED ACCESS KEY</value>
+</property>
 
-    <property>
-      <name>fs.azure.shellkeyprovider.script</name>
-      <value>PATH TO DECRYPTION PROGRAM</value>
-    </property>
+<property>
+  <name>fs.azure.shellkeyprovider.script</name>
+  <value>PATH TO DECRYPTION PROGRAM</value>
+</property>
+
+```
 
-### <a name="Page_Blob_Support_and_Configuration" />Page Blob Support and Configuration
+### Page Blob Support and Configuration
 
 The Azure Blob Storage interface for Hadoop supports two kinds of blobs,
 [block blobs and page blobs](http://msdn.microsoft.com/en-us/library/azure/ee691964.aspx).
@@ -182,10 +175,12 @@ folder names.
 
 For example:
 
-    <property>
-      <name>fs.azure.page.blob.dir</name>
-      <value>/hbase/WALs,/hbase/oldWALs,/data/mypageblobfiles</value>
-    </property>
+```xml
+<property>
+  <name>fs.azure.page.blob.dir</name>
+  <value>/hbase/WALs,/hbase/oldWALs,/data/mypageblobfiles</value>
+</property>
+```
 
 You can set this to simply / to make all files page blobs.
 
@@ -197,7 +192,7 @@ The configuration option `fs.azure.page.blob.extension.size` is the page blob
 extension size.  This defines the amount to extend a page blob if it starts to
 get full.  It must be 128MB or greater, specified as an integer number of bytes.
 
-### <a name="Atomic_Folder_Rename" />Atomic Folder Rename
+### Atomic Folder Rename
 
 Azure storage stores files as a flat key/value store without formal support
 for folders.  The hadoop-azure file system layer simulates folders on top
@@ -216,12 +211,14 @@ the intention of the rename operation, to allow redo in event of a failure.
 
 For example:
 
-    <property>
-      <name>fs.azure.atomic.rename.dir</name>
-      <value>/hbase,/data</value>
-    </property>
+```xml
+<property>
+  <name>fs.azure.atomic.rename.dir</name>
+  <value>/hbase,/data</value>
+</property>
+```
 
-### <a name="Accessing_wasb_URLs" />Accessing wasb URLs
+### Accessing wasb URLs
 
 After credentials are configured in core-site.xml, any Hadoop component may
 reference files in that Azure Blob Storage account by using URLs of the following
@@ -238,28 +235,32 @@ For example, the following
 commands demonstrate access to a storage account named `youraccount` and a
 container named `yourcontainer`.
 
-    > hadoop fs -mkdir wasb://yourcontainer@youraccount.blob.core.windows.net/testDir
+```bash
+% hadoop fs -mkdir wasb://yourcontainer@youraccount.blob.core.windows.net/testDir
 
-    > hadoop fs -put testFile wasb://yourcontainer@youraccount.blob.core.windows.net/testDir/testFile
+% hadoop fs -put testFile wasb://yourcontainer@youraccount.blob.core.windows.net/testDir/testFile
 
-    > hadoop fs -cat wasbs://yourcontainer@youraccount.blob.core.windows.net/testDir/testFile
-    test file content
+% hadoop fs -cat wasbs://yourcontainer@youraccount.blob.core.windows.net/testDir/testFile
+test file content
+```
 
 It's also possible to configure `fs.defaultFS` to use a `wasb` or `wasbs` URL.
 This causes all bare paths, such as `/testDir/testFile` to resolve automatically
 to that file system.
 
-### <a name="Append_API_Support_and_Configuration" />Append API Support and Configuration
+### Append API Support and Configuration
 
 The Azure Blob Storage interface for Hadoop has optional support for Append API for
 single writer by setting the configuration `fs.azure.enable.append.support` to true.
 
 For Example:
 
-    <property>
-      <name>fs.azure.enable.append.support</name>
-      <value>true</value>
-    </property>
+```xml
+<property>
+  <name>fs.azure.enable.append.support</name>
+  <value>true</value>
+</property>
+```
 
 It must be noted Append support in Azure Blob Storage interface DIFFERS FROM HDFS SEMANTICS. Append
 support does not enforce single writer internally but requires applications to guarantee this semantic.
@@ -267,25 +268,29 @@ It becomes a responsibility of the application either to ensure single-threaded
 file path, or rely on some external locking mechanism of its own.  Failure to do so will result in
 unexpected behavior.
 
-### <a name="Multithread_Support" />Multithread Support
+### Multithread Support
 
 Rename and Delete blob operations on directories with large number of files and sub directories currently is very slow as these operations are done one blob at a time serially. These files and sub folders can be deleted or renamed parallel. Following configurations can be used to enable threads to do parallel processing
 
 To enable 10 threads for Delete operation. Set configuration value to 0 or 1 to disable threads. The default behavior is threads disabled.
 
-    <property>
-      <name>fs.azure.delete.threads</name>
-      <value>10</value>
-    </property>
+```xml
+<property>
+  <name>fs.azure.delete.threads</name>
+  <value>10</value>
+</property>
+```
 
 To enable 20 threads for Rename operation. Set configuration value to 0 or 1 to disable threads. The default behavior is threads disabled.
 
-    <property>
-      <name>fs.azure.rename.threads</name>
-      <value>20</value>
-    </property>
+```xml
+<property>
+  <name>fs.azure.rename.threads</name>
+  <value>20</value>
+</property>
+```
 
-### <a name="WASB_SECURE_MODE" />WASB Secure mode and configuration
+### WASB Secure mode and configuration
 
 WASB can operate in secure mode where the Storage access keys required to communicate with Azure storage does not have to
 be in the same address space as the process using WASB. In this mode all interactions with Azure storage is performed using
@@ -295,30 +300,32 @@ Romote mode, however for testing purposes the local mode can be enabled to gener
 
 To enable Secure mode following property needs to be set to true.
 
-```
-    <property>
-      <name>fs.azure.secure.mode</name>
-      <value>true</value>
-    </property>
+```xml
+<property>
+  <name>fs.azure.secure.mode</name>
+  <value>true</value>
+</property>
 ```
 
 To enable SAS key generation locally following property needs to be set to true.
 
+```xml
+<property>
+  <name>fs.azure.local.sas.key.mode</name>
+  <value>true</value>
+</property>
 ```
-    <property>
-      <name>fs.azure.local.sas.key.mode</name>
-      <value>true</value>
-    </property>
-```
+
 To use the remote SAS key generation mode, an external REST service is expected to provided required SAS keys.
 Following property can used to provide the end point to use for remote SAS Key generation:
 
+```xml
+<property>
+  <name>fs.azure.cred.service.url</name>
+  <value>{URL}</value>
+</property>
 ```
-    <property>
-      <name>fs.azure.cred.service.url</name>
-      <value>{URL}</value>
-    </property>
-```
+
 The remote service is expected to provide support for two REST calls ```{URL}/GET_CONTAINER_SAS``` and ```{URL}/GET_RELATIVE_BLOB_SAS```, for generating
 container and relative blob sas keys. An example requests
 
@@ -326,7 +333,8 @@ container and relative blob sas keys. An example requests
 ```{URL}/GET_CONTAINER_SAS?storage_account=<account_name>&container=<container>&relative_path=<relative path>&sas_expiry=<expiry period>&delegation_token=<delegation token>```
 
 The service is expected to return a response in JSON format:
-```
+
+```json
 {
   "responseCode" : 0 or non-zero <int>,
   "responseMessage" : relavant message on failure <String>,
@@ -334,40 +342,42 @@ The service is expected to return a response in JSON format:
 }
 ```
 
-## <a name="WASB Authorization" />Authorization Support in WASB.
+### Authorization Support in WASB
 
 Authorization support can be enabled in WASB using the following configuration:
 
+```xml
+<property>
+  <name>fs.azure.authorization</name>
+  <value>true</value>
+</property>
 ```
-    <property>
-      <name>fs.azure.authorization</name>
-      <value>true</value>
-    </property>
-```
-  The current implementation of authorization relies on the presence of an external service that can enforce
-  the authorization. The service is expected to be running on a URL provided by the following config.
 
-```
-    <property>
-      <name>fs.azure.authorization.remote.service.url</name>
-      <value>{URL}</value>
-    </property>
+The current implementation of authorization relies on the presence of an external service that can enforce
+the authorization. The service is expected to be running on a URL provided by the following config.
+
+```xml
+<property>
+  <name>fs.azure.authorization.remote.service.url</name>
+  <value>{URL}</value>
+</property>
 ```
 
-  The remote service is expected to provide support for the following REST call: ```{URL}/CHECK_AUTHORIZATION```
-  An example request:
+The remote service is expected to provide support for the following REST call: ```{URL}/CHECK_AUTHORIZATION```
+An example request:
   ```{URL}/CHECK_AUTHORIZATION?wasb_absolute_path=<absolute_path>&operation_type=<operation type>&delegation_token=<delegation token>```
 
-  The service is expected to return a response in JSON format:
-  ```
-  {
+The service is expected to return a response in JSON format:
+
+```json
+{
     "responseCode" : 0 or non-zero <int>,
     "responseMessage" : relevant message on failure <String>,
     "authorizationResult" : true/false <boolean>
-  }
-  ```
+}
+```
 
-## <a name="Testing_the_hadoop-azure_Module" />Testing the hadoop-azure Module
+## Testing the hadoop-azure Module
 
 The hadoop-azure module includes a full suite of unit tests.  Most of the tests
 will run without additional configuration by running `mvn test`.  This includes
@@ -382,10 +392,12 @@ that runs on a local machine.
 To use the emulator, install Azure SDK 2.3 and start the storage emulator.  Then,
 edit `src/test/resources/azure-test.xml` and add the following property:
 
-    <property>
-      <name>fs.azure.test.emulator</name>
-      <value>true</value>
-    </property>
+```xml
+<property>
+  <name>fs.azure.test.emulator</name>
+  <value>true</value>
+</property>
+```
 
 There is a known issue when running tests with the emulator.  You may see the
 following failure message:
@@ -399,6 +411,7 @@ file to `src/test/resources/azure-auth-keys.xml` and setting
 the name of the storage account and its access key.
 
 For example:
+
 ```xml
 <?xml version="1.0"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>


---------------------------------------------------------------------
To unsubscribe, e-mail: common-commits-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-commits-help@hadoop.apache.org