Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2019/10/16 18:22:27 UTC

[GitHub] [accumulo-website] karthick-rn commented on a change in pull request #198: Blog post to configure Accumulo with Azure Data Lake Gen2 Storage

URL: https://github.com/apache/accumulo-website/pull/198#discussion_r335637632
 
 

 ##########
 File path: _posts/blog/2019-10-15-accumulo-adlsgen2-notes.md
 ##########
 @@ -0,0 +1,132 @@
+---
+title: "Using ADLS Gen2 as a data store for Accumulo"
+author: Karthick Narendran
+---
+
+Accumulo can store its files in [Azure Data Lake Storage Gen2](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction)
+using the [ABFS (Azure Blob File System)](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-abfs-driver) driver.
+Similar to the [S3 blog post](https://accumulo.apache.org/blog/2019/09/10/accumulo-S3-notes.html),
+the write-ahead logs and Accumulo metadata can be stored in HDFS and everything else on Gen2 storage
+using the volume chooser feature introduced in Accumulo 2.0. The configurations referred to in this post
+are specific to Accumulo 2.0 and Hadoop 3.2.0.
+
+## Hadoop setup
+
+For the ABFS client to talk to Gen2 storage, it requires one of the authentication mechanisms listed [here](https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html#Authentication).
+This post covers [Azure Managed Identity](https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview),
+formerly known as Managed Service Identity or MSI. This feature provides Azure services with an
+automatically managed identity in [Azure AD](https://docs.microsoft.com/en-us/azure/active-directory/fundamentals/active-directory-whatis)
+and avoids the need to store credentials or other sensitive information in code
+or configs/JCEKS. Plus, it comes free with Azure AD.
+
+At least the following settings should be added to Hadoop's `core-site.xml` file on each node in the cluster. 
+
+```xml
+<property>
+  <name>fs.azure.account.auth.type</name>
+  <value>OAuth</value>
+</property>
+<property>
+  <name>fs.azure.account.oauth.provider.type</name>
+  <value>org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider</value>
+</property>
+<property>
+  <name>fs.azure.account.oauth2.msi.tenant</name>
+  <value>TenantID</value>
+</property>
+<property>
+  <name>fs.azure.account.oauth2.client.id</name>
+  <value>ClientID</value>
+</property>
+```
+ 
+See the [ABFS documentation](https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html)
+for more information on Hadoop Azure support.
+
+To get the `hadoop` command to work with ADLS Gen2, set the
+following entries in `hadoop-env.sh`. As Gen2 storage is TLS enabled by default,
+it is important to use the native OpenSSL implementation of TLS.
+
+```bash
+export HADOOP_OPTIONAL_TOOLS="hadoop-azure"
+export HADOOP_OPTS="-Dorg.wildfly.openssl.path=<path/to/OpenSSL/libraries> ${HADOOP_OPTS}"
 
 Review comment:
   In `accumulo-env.sh`, I have already added the OpenSSL jar to the classpath, and it initialises successfully when Accumulo starts.
   
   ```bash
   CLASSPATH="${CLASSPATH}:${HADOOP_HOME}/share/hadoop/tools/lib/wildfly-openssl-1.0.4.Final.jar"
   ```
   However, when I add `-Dorg.wildfly.openssl.path=<path/to/OpenSSL/libraries>` to `JAVA_OPTS`, OpenSSL fails to load, as shown below.
   
   ```text
   2019-10-16 18:02:05,445 [utils.SSLSocketFactoryEx] WARN : Failed to load OpenSSL. Falling back to the JSSE default.
   ```
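
   One way to narrow down a warning like this is to check what is actually in the directory passed to `-Dorg.wildfly.openssl.path`. The following is only a minimal sketch: `check_openssl_dir` is a hypothetical helper, not part of Accumulo or Hadoop, and it rests on the assumption that the directory is expected to contain shared libraries whose names start with `libssl` and `libcrypto`.

   ```bash
   # Hypothetical helper (not part of Accumulo or Hadoop).
   # Assumption: the directory given via -Dorg.wildfly.openssl.path should
   # hold shared libraries whose names start with libssl and libcrypto.
   # Prints one line per library and returns non-zero if either is missing.
   # Note: uses plain (global) shell variables so it also runs under POSIX sh.
   check_openssl_dir() {
     dir="$1"
     missing=0
     for name in libssl libcrypto; do
       found=0
       # If the glob matches nothing it stays literal, so -e fails below.
       for f in "${dir}/${name}"*; do
         if [ -e "$f" ]; then
           found=1
           break
         fi
       done
       if [ "$found" -eq 1 ]; then
         echo "found ${name} in ${dir}"
       else
         echo "missing ${name} in ${dir}"
         missing=1
       fi
     done
     return "$missing"
   }
   ```

   Calling `check_openssl_dir <path/to/OpenSSL/libraries>` with the same path used in `JAVA_OPTS` would at least show whether the path itself is the problem. A non-zero return only means that no matching file names were found; it does not prove that the libraries present are a version the OpenSSL jar can load.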
   
   
   
   
   
