Posted to commits@accumulo.apache.org by br...@apache.org on 2022/04/21 13:01:49 UTC
[accumulo-testing] branch main updated: Managed disk support for Azure terraform infra (#202)
This is an automated email from the ASF dual-hosted git repository.
brianloss pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/accumulo-testing.git
The following commit(s) were added to refs/heads/main by this push:
new e9cbb85 Managed disk support for Azure terraform infra (#202)
e9cbb85 is described below
commit e9cbb856a0fac3335fff13fc175d7439c0940cca
Author: Brian Loss <br...@gmail.com>
AuthorDate: Thu Apr 21 09:01:44 2022 -0400
Managed disk support for Azure terraform infra (#202)
Support adding managed disks to the Azure VMs created by the Terraform
testing infrastructure. By adding multiple managed disks to a VM, we can
get significantly more space for data storage and also increase
performance since the data is striped across multiple disks.
* Modify the cloud-init module to accept an argument indicating the type
of deployment (AWS or Azure) so that conditional blocks can be
included in the cloud-init script.
* The cloud-init module now accepts an optional lvm_mount_point argument.
If this argument is specified, the cloud-init script assumes that
managed disks were created. It loads a script onto the VM and runs it
to wait for the disks to be attached, then groups them into an LVM
volume that is mounted at the specified mount point.
* The Azure main.tf file accepts a new optional managed_disk_configuration
argument that contains the LVM mount point, and the number, size, and
SKU of managed disks to add to each VM. If this argument is specified,
then the managed disks are created and attached to the VMs, and the
LVM mount point and expected number of disks are passed along to the
cloud-init module. Because of the way Terraform supports attaching
managed disks (they must be attached after the VM is created, although
Azure does not have this restriction), the provisioner script that
waits for cloud-init to complete had to be moved outside of the VM
creation to a null_resource. This null_resource must then be
explicitly added as a dependency of any module that requires the
manager or worker VMs to be created AND cloud-init to have finished
running.
* Fix bug in Azure configuration where the script would fail if the
create_resource_group variable was set to false (indicating that an
existing resource group should be used instead of creating a new one).
* Update the Maven version to 3.8.5.
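
As an illustration of the managed_disk_configuration argument described
above, a terraform.tfvars entry might look like the following sketch. The
field names and the allowed storage_account_type values come from the
variables.tf change in this commit; the specific values (mount point, disk
count, disk size) are assumptions chosen only for the example.

```hcl
# Hypothetical values for illustration; only the field names and the
# storage account type options are taken from this commit.
managed_disk_configuration = {
  mount_point          = "/data"       # assumed mount point for the LVM volume
  disk_count           = 4             # assumed; validation requires at least 1
  storage_account_type = "Premium_LRS" # or "Standard_LRS" / "StandardSSD_LRS"
  disk_size_gb         = 512           # assumed; validation allows 1 to 32767
}
```

With a configuration like this, each VM would get four empty managed disks
attached at LUNs 10 through 13 (the diff uses lun = 10 + index), and the
cloud-init script would stripe them into a single LVM volume mounted at
/data. Leaving the variable at its null default is intended to skip managed
disk creation.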
---
contrib/terraform-testing-infrastructure/README.md | 5 +-
.../terraform-testing-infrastructure/aws/main.tf | 1 +
.../aws/variables.tf | 2 +-
.../terraform-testing-infrastructure/azure/main.tf | 141 +++++++++++++++++----
.../azure/variables.tf | 31 ++++-
.../files/azure-format-lvm-data-disk.sh | 52 ++++++++
.../modules/cloud-init-config/main.tf | 23 +++-
.../cloud-init-config/templates/cloud-init.tftpl | 9 ++
.../modules/config-files/templates/zoo.cfg.tftpl | 2 +-
9 files changed, 237 insertions(+), 29 deletions(-)
diff --git a/contrib/terraform-testing-infrastructure/README.md b/contrib/terraform-testing-infrastructure/README.md
index a4c3fea..9c99883 100644
--- a/contrib/terraform-testing-infrastructure/README.md
+++ b/contrib/terraform-testing-infrastructure/README.md
@@ -165,7 +165,7 @@ The table below lists the variables and their default values that are used in th
| instance\_count | The number of EC2 instances to create | `string` | `"2"` | no |
| instance\_type | The type of EC2 instances to create | `string` | `"m5.2xlarge"` | no |
| local\_sources\_dir | Directory on local machine that contains Maven, ZooKeeper or Hadoop binary distributions or Accumulo source tarball | `string` | `""` | no |
-| maven\_version | The version of Maven to download and install | `string` | `"3.8.4"` | no |
+| maven\_version | The version of Maven to download and install | `string` | `"3.8.5"` | no |
| optional\_cloudinit\_config | An optional config block for the cloud-init script. If you set this, you should consider setting cloudinit\_merge\_type to handle merging with the default script as you need. | `string` | `null` | no |
| private\_network | Indicates whether or not the user is on a private network and access to hosts should be through the private IP addresses rather than public ones. | `bool` | `false` | no |
| root\_volume\_gb | The size, in GB, of the EC2 instance root volume | `string` | `"300"` | no |
@@ -208,7 +208,8 @@ The table below lists the variables and their default values that are used in th
| hadoop\_version | The version of Hadoop to download and install | `string` | `"3.3.1"` | no |
| local\_sources\_dir | Directory on local machine that contains Maven, ZooKeeper or Hadoop binary distributions or Accumulo source tarball | `string` | `""` | no |
| location | The Azure region where resources are to be created. If an existing resource group is specified, this value is ignored and the resource group's location is used. | `string` | n/a | yes |
-| maven\_version | The version of Maven to download and install | `string` | `"3.8.4"` | no |
+| managed\_disk\_configuration | Optional managed disk configuration. If supplied, the managed disks on each VM will be combined into an LVM volume mounted at the named mount point. | <pre>object({<br> mount_point = string<br> disk_count = number<br> storage_account_type = string<br> disk_size_gb = number<br> })</pre> | `null` | no |
+| maven\_version | The version of Maven to download and install | `string` | `"3.8.5"` | no |
| network\_address\_space | The network address space to use for the virtual network. | `list(string)` | <pre>[<br> "10.0.0.0/16"<br>]</pre> | no |
| optional\_cloudinit\_config | An optional config block for the cloud-init script. If you set this, you should consider setting cloudinit\_merge\_type to handle merging with the default script as you need. | `string` | `null` | no |
| os\_disk\_caching | The type of caching to use for the OS disk. Possible values are None, ReadOnly, and ReadWrite. | `string` | `"ReadOnly"` | no |
diff --git a/contrib/terraform-testing-infrastructure/aws/main.tf b/contrib/terraform-testing-infrastructure/aws/main.tf
index f4444f3..59a855a 100644
--- a/contrib/terraform-testing-infrastructure/aws/main.tf
+++ b/contrib/terraform-testing-infrastructure/aws/main.tf
@@ -131,6 +131,7 @@ module "cloud_init_config" {
accumulo_branch_name = var.accumulo_branch_name
accumulo_version = var.accumulo_version
authorized_ssh_keys = local.ssh_keys[*]
+ cluster_type = "aws"
optional_cloudinit_config = var.optional_cloudinit_config
cloudinit_merge_type = var.cloudinit_merge_type
diff --git a/contrib/terraform-testing-infrastructure/aws/variables.tf b/contrib/terraform-testing-infrastructure/aws/variables.tf
index b19c2a5..5d7f0e4 100644
--- a/contrib/terraform-testing-infrastructure/aws/variables.tf
+++ b/contrib/terraform-testing-infrastructure/aws/variables.tf
@@ -129,7 +129,7 @@ variable "accumulo_dir" {
}
variable "maven_version" {
- default = "3.8.4"
+ default = "3.8.5"
description = "The version of Maven to download and install"
nullable = false
}
diff --git a/contrib/terraform-testing-infrastructure/azure/main.tf b/contrib/terraform-testing-infrastructure/azure/main.tf
index 4e11749..82bf818 100644
--- a/contrib/terraform-testing-infrastructure/azure/main.tf
+++ b/contrib/terraform-testing-infrastructure/azure/main.tf
@@ -69,6 +69,15 @@ locals {
ssh_keys = toset(concat(var.authorized_ssh_keys, [for k in var.authorized_ssh_key_files : file(k)]))
+ # Resource group name and location
+ # This is pulled either from the resource group that was created (if create_resource_group is true)
+ # or from the resource group that already exists (if create_resource_group is false). Keeping
+ # references to the resource group or data object rather than just using var.resource_group_name
+ # allows for terraform to automatically create the dependency graph and wait for the resource group
+ # to be created if necessary.
+ rg_name = var.create_resource_group ? azurerm_resource_group.rg[0].name : data.azurerm_resource_group.existing_rg[0].name
+ location = var.create_resource_group ? azurerm_resource_group.rg[0].location : data.azurerm_resource_group.existing_rg[0].location
+
# Save the public/private IP addresses of the VMs to pass to sub-modules.
manager_ip = azurerm_linux_virtual_machine.manager.public_ip_address
worker_ips = azurerm_linux_virtual_machine.workers[*].public_ip_address
@@ -84,6 +93,11 @@ locals {
]
}
+data "azurerm_resource_group" "existing_rg" {
+ count = var.create_resource_group ? 0 : 1
+ name = var.resource_group_name
+}
+
# Place all resources in a resource group
resource "azurerm_resource_group" "rg" {
count = var.create_resource_group ? 1 : 0
@@ -98,8 +112,8 @@ resource "azurerm_resource_group" "rg" {
# Creates a virtual network for use by this cluster.
resource "azurerm_virtual_network" "accumulo_vnet" {
name = "${var.resource_name_prefix}-vnet"
- resource_group_name = azurerm_resource_group.rg[0].name
- location = azurerm_resource_group.rg[0].location
+ resource_group_name = local.rg_name
+ location = local.location
address_space = var.network_address_space
}
@@ -107,7 +121,7 @@ resource "azurerm_virtual_network" "accumulo_vnet" {
# so that we'll be able to create an NFS share.
resource "azurerm_subnet" "internal" {
name = "${var.resource_name_prefix}-subnet"
- resource_group_name = azurerm_resource_group.rg[0].name
+ resource_group_name = local.rg_name
virtual_network_name = azurerm_virtual_network.accumulo_vnet.name
address_prefixes = var.subnet_address_prefixes
}
@@ -116,8 +130,8 @@ resource "azurerm_subnet" "internal" {
# traffic from the internet and denies everything else.
resource "azurerm_network_security_group" "nsg" {
name = "${var.resource_name_prefix}-nsg"
- location = azurerm_resource_group.rg[0].location
- resource_group_name = azurerm_resource_group.rg[0].name
+ location = local.location
+ resource_group_name = local.rg_name
security_rule {
name = "allow-ssh"
@@ -140,6 +154,8 @@ resource "azurerm_network_security_group" "nsg" {
module "cloud_init_config" {
source = "../modules/cloud-init-config"
+ lvm_mount_point = var.managed_disk_configuration != null ? var.managed_disk_configuration.mount_point : null
+ lvm_disk_count = var.managed_disk_configuration != null ? var.managed_disk_configuration.disk_count : null
software_root = var.software_root
zookeeper_dir = var.zookeeper_dir
hadoop_dir = var.hadoop_dir
@@ -151,6 +167,7 @@ module "cloud_init_config" {
accumulo_version = var.accumulo_version
authorized_ssh_keys = local.ssh_keys[*]
os_type = local.os_type
+ cluster_type = "azure"
optional_cloudinit_config = var.optional_cloudinit_config
cloudinit_merge_type = var.cloudinit_merge_type
@@ -159,16 +176,16 @@ module "cloud_init_config" {
# Create a static public IP address for the manager node.
resource "azurerm_public_ip" "manager" {
name = "${var.resource_name_prefix}-manager-ip"
- resource_group_name = azurerm_resource_group.rg[0].name
- location = azurerm_resource_group.rg[0].location
+ resource_group_name = local.rg_name
+ location = local.location
allocation_method = "Static"
}
# Create a NIC for the manager node.
resource "azurerm_network_interface" "manager" {
name = "${var.resource_name_prefix}-manager-nic"
- location = azurerm_resource_group.rg[0].location
- resource_group_name = azurerm_resource_group.rg[0].name
+ location = local.location
+ resource_group_name = local.rg_name
enable_accelerated_networking = true
@@ -190,8 +207,8 @@ resource "azurerm_network_interface_security_group_association" "manager" {
resource "azurerm_public_ip" "workers" {
count = var.worker_count
name = "${var.resource_name_prefix}-worker${count.index}-ip"
- resource_group_name = azurerm_resource_group.rg[0].name
- location = azurerm_resource_group.rg[0].location
+ resource_group_name = local.rg_name
+ location = local.location
allocation_method = "Static"
}
@@ -199,8 +216,8 @@ resource "azurerm_public_ip" "workers" {
resource "azurerm_network_interface" "workers" {
count = var.worker_count
name = "${var.resource_name_prefix}-worker${count.index}-nic"
- location = azurerm_resource_group.rg[0].location
- resource_group_name = azurerm_resource_group.rg[0].name
+ location = local.location
+ resource_group_name = local.rg_name
enable_accelerated_networking = true
@@ -223,8 +240,8 @@ resource "azurerm_network_interface_security_group_association" "workers" {
# Add a login user that can SSH to the VM using the first supplied SSH key.
resource "azurerm_linux_virtual_machine" "manager" {
name = "${var.resource_name_prefix}-manager"
- resource_group_name = azurerm_resource_group.rg[0].name
- location = azurerm_resource_group.rg[0].location
+ resource_group_name = local.rg_name
+ location = local.location
size = var.vm_sku
computer_name = "manager"
admin_username = var.admin_username
@@ -256,15 +273,44 @@ resource "azurerm_linux_virtual_machine" "manager" {
sku = var.vm_image.sku
version = var.vm_image.version
}
+}
+
+# Create and attach managed disks to the manager VM.
+resource "azurerm_managed_disk" "manager_managed_disk" {
+ count = var.managed_disk_configuration != null ? var.managed_disk_configuration.disk_count : 0
+ name = format("%s_disk%02d", azurerm_linux_virtual_machine.manager.name, count.index)
+ resource_group_name = local.rg_name
+ location = local.location
+ storage_account_type = var.managed_disk_configuration.storage_account_type
+ disk_size_gb = var.managed_disk_configuration.disk_size_gb
+ create_option = "Empty"
+}
+
+resource "azurerm_virtual_machine_data_disk_attachment" "manager_managed_disk_attachment" {
+ count = var.managed_disk_configuration != null ? var.managed_disk_configuration.disk_count : 0
+ managed_disk_id = azurerm_managed_disk.manager_managed_disk[count.index].id
+ virtual_machine_id = azurerm_linux_virtual_machine.manager.id
+ lun = 10 + count.index
+ caching = "ReadOnly"
+}
+# Wait for cloud-init to complete on the manager VM.
+# This is done here rather than in the VM resource because the cloud-init script
+# waits for managed disks to be attached (if used), but the managed disks cannot
+# be attached until the VM is created, so we'd have a deadlock.
+resource "null_resource" "wait_for_manager_cloud_init" {
provisioner "remote-exec" {
inline = local.ready_script
connection {
type = "ssh"
- user = self.admin_username
- host = self.public_ip_address
+ user = azurerm_linux_virtual_machine.manager.admin_username
+ host = azurerm_linux_virtual_machine.manager.public_ip_address
}
}
+
+ depends_on = [
+ azurerm_virtual_machine_data_disk_attachment.manager_managed_disk_attachment
+ ]
}
# Create the worker VMs.
@@ -272,8 +318,8 @@ resource "azurerm_linux_virtual_machine" "manager" {
resource "azurerm_linux_virtual_machine" "workers" {
count = var.worker_count
name = "${var.resource_name_prefix}-worker${count.index}"
- resource_group_name = azurerm_resource_group.rg[0].name
- location = azurerm_resource_group.rg[0].location
+ resource_group_name = local.rg_name
+ location = local.location
size = var.vm_sku
computer_name = "worker${count.index}"
admin_username = var.admin_username
@@ -305,15 +351,57 @@ resource "azurerm_linux_virtual_machine" "workers" {
sku = var.vm_image.sku
version = var.vm_image.version
}
+}
+
+# Create and attach managed disks to the worker VMs.
+locals {
+ worker_disks = var.managed_disk_configuration == null ? [] : flatten([
+ for vm_num, vm in azurerm_linux_virtual_machine.workers : [
+ for disk_num in range(var.managed_disk_configuration.disk_count) : {
+ datadisk_name = format("%s_disk%02d", vm.name, disk_num)
+ lun = 10 + disk_num
+ worker_num = vm_num
+ }
+ ]
+ ])
+}
+
+resource "azurerm_managed_disk" "worker_managed_disk" {
+ count = length(local.worker_disks)
+ name = local.worker_disks[count.index].datadisk_name
+ resource_group_name = local.rg_name
+ location = local.location
+ storage_account_type = var.managed_disk_configuration.storage_account_type
+ disk_size_gb = var.managed_disk_configuration.disk_size_gb
+ create_option = "Empty"
+}
+resource "azurerm_virtual_machine_data_disk_attachment" "worker_managed_disk_attachment" {
+ count = length(local.worker_disks)
+ managed_disk_id = azurerm_managed_disk.worker_managed_disk[count.index].id
+ virtual_machine_id = azurerm_linux_virtual_machine.workers[local.worker_disks[count.index].worker_num].id
+ lun = local.worker_disks[count.index].lun
+ caching = "ReadOnly"
+}
+
+# Wait for cloud-init to complete on the worker VMs.
+# This is done here rather than in the VM resources because the cloud-init script
+# waits for managed disks to be attached (if used), but the managed disks cannot
+# be attached until the VMs are created, so we'd have a deadlock.
+resource "null_resource" "wait_for_workers_cloud_init" {
+ count = length(azurerm_linux_virtual_machine.workers)
provisioner "remote-exec" {
inline = local.ready_script
connection {
type = "ssh"
- user = self.admin_username
- host = self.public_ip_address
+ user = azurerm_linux_virtual_machine.workers[count.index].admin_username
+ host = azurerm_linux_virtual_machine.workers[count.index].public_ip_address
}
}
+
+ depends_on = [
+ azurerm_virtual_machine_data_disk_attachment.worker_managed_disk_attachment
+ ]
}
##############################
@@ -351,6 +439,10 @@ module "config_files" {
accumulo_instance_name = var.accumulo_instance_name
accumulo_root_password = var.accumulo_root_password
+
+ depends_on = [
+ null_resource.wait_for_manager_cloud_init
+ ]
}
#
@@ -363,6 +455,10 @@ module "upload_software" {
local_sources_dir = var.local_sources_dir
upload_dir = var.software_root
upload_host = local.manager_ip
+
+ depends_on = [
+ null_resource.wait_for_manager_cloud_init
+ ]
}
#
@@ -379,7 +475,8 @@ module "configure_nodes" {
depends_on = [
module.upload_software,
- module.config_files
+ module.config_files,
+ null_resource.wait_for_workers_cloud_init
]
}
diff --git a/contrib/terraform-testing-infrastructure/azure/variables.tf b/contrib/terraform-testing-infrastructure/azure/variables.tf
index 9edf84b..2a0e8bb 100644
--- a/contrib/terraform-testing-infrastructure/azure/variables.tf
+++ b/contrib/terraform-testing-infrastructure/azure/variables.tf
@@ -126,6 +126,35 @@ variable "os_disk_caching" {
}
}
+variable "managed_disk_configuration" {
+ default = null
+ type = object({
+ mount_point = string
+ disk_count = number
+ storage_account_type = string
+ disk_size_gb = number
+ })
+ description = "Optional managed disk configuration. If supplied, the managed disks on each VM will be combined into an LVM volume mounted at the named mount point."
+ nullable = true
+
+ validation {
+ condition = var.managed_disk_configuration.mount_point != null
+ error_message = "The mount point must be specified."
+ }
+ validation {
+ condition = var.managed_disk_configuration.disk_count > 0
+ error_message = "The number of disks must be at least 1."
+ }
+ validation {
+ condition = contains(["Standard_LRS", "StandardSSD_LRS", "Premium_LRS"], var.managed_disk_configuration.storage_account_type)
+ error_message = "The storage account type must be one of 'Standard_LRS', 'StandardSSD_LRS', or 'Premium_LRS'."
+ }
+ validation {
+ condition = var.managed_disk_configuration.disk_size_gb > 0 && var.managed_disk_configuration.disk_size_gb <= 32767
+ error_message = "The disk size must be at least 1GB and less than 32768GB."
+ }
+}
+
variable "software_root" {
default = "/opt/accumulo-testing"
description = "The full directory root where software will be installed"
@@ -178,7 +207,7 @@ variable "accumulo_dir" {
}
variable "maven_version" {
- default = "3.8.4"
+ default = "3.8.5"
description = "The version of Maven to download and install"
nullable = false
}
diff --git a/contrib/terraform-testing-infrastructure/modules/cloud-init-config/files/azure-format-lvm-data-disk.sh b/contrib/terraform-testing-infrastructure/modules/cloud-init-config/files/azure-format-lvm-data-disk.sh
new file mode 100644
index 0000000..4a94412
--- /dev/null
+++ b/contrib/terraform-testing-infrastructure/modules/cloud-init-config/files/azure-format-lvm-data-disk.sh
@@ -0,0 +1,52 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+[ $# -eq 3 ] || { echo "usage: $0 disk_count mount_point user.group"; exit 1; }
+
+diskCount=$1
+mountPoint=$2
+owner=$3
+
+until [[ $(ls -1 /dev/disk/azure/scsi1/ | wc -l) == "$diskCount" ]]; do
+ echo "Waiting for $diskCount disks to be attached..."
+ sleep 10
+done
+
+VG_GROUP_NAME=storage_vg
+LG_GROUP_NAME=storage_lv
+
+DISK_PATH="/dev/disk/azure/scsi1"
+declare -a REAL_PATH_ARR
+
+for i in $(ls ${DISK_PATH} 2>/dev/null);
+do
+ REAL_PATH=`realpath ${DISK_PATH}/${i}|tr '\n' ' ' `
+ REAL_PATH_ARR+=($REAL_PATH)
+done;
+
+
+RAID_DEVICE_LIST=`echo "${REAL_PATH_ARR[@]}"|sort`
+RAID_DEVICES_COUNT=`echo "${#REAL_PATH_ARR[@]}"`
+pvcreate ${RAID_DEVICE_LIST}
+vgcreate -s 4M ${VG_GROUP_NAME} ${RAID_DEVICE_LIST}
+lvcreate -n $LG_GROUP_NAME -l 100%FREE -i ${RAID_DEVICES_COUNT} ${VG_GROUP_NAME}
+mkfs.xfs -K -f /dev/${VG_GROUP_NAME}/${LG_GROUP_NAME}
+mkdir -p ${mountPoint}
+printf "/dev/${VG_GROUP_NAME}/${LG_GROUP_NAME}\t${mountPoint}\tauto\tdefaults,noatime\t0\t2\n" >> /etc/fstab
+mount --target ${mountPoint}
+chown ${owner} ${mountPoint}
diff --git a/contrib/terraform-testing-infrastructure/modules/cloud-init-config/main.tf b/contrib/terraform-testing-infrastructure/modules/cloud-init-config/main.tf
index 1de3a15..2fa1299 100644
--- a/contrib/terraform-testing-infrastructure/modules/cloud-init-config/main.tf
+++ b/contrib/terraform-testing-infrastructure/modules/cloud-init-config/main.tf
@@ -25,6 +25,16 @@ variable "hadoop_version" {}
variable "accumulo_branch_name" {}
variable "accumulo_version" {}
variable "authorized_ssh_keys" {}
+variable "lvm_mount_point" {
+ default = null
+ description = "Mount point for the LVM volume containing managed disks. If not specified, then no LVM volume is created."
+ nullable = true
+}
+variable "lvm_disk_count" {
+ default = null
+ description = "Number of disks to be combined in an LVM volume. If lvm_mount_point is not specified, this is not used."
+ nullable = true
+}
variable "cloudinit_merge_type" {
default = "dict(recurse_array,no_replace)+list(append)"
nullable = false
@@ -42,6 +52,14 @@ variable "os_type" {
error_message = "The value of os_type must be either 'centos' or 'ubuntu'."
}
}
+variable "cluster_type" {
+ type = string
+ nullable = false
+ validation {
+ condition = contains(["aws", "azure"], var.cluster_type)
+ error_message = "The value of cluster_type must be either 'aws' or 'azure'."
+ }
+}
#####################
# Create Hadoop Key #
@@ -68,7 +86,10 @@ locals {
accumulo_branch_name = var.accumulo_branch_name
accumulo_version = var.accumulo_version
authorized_ssh_keys = local.ssh_keys[*]
+ lvm_mount_point = var.lvm_mount_point != null ? var.lvm_mount_point : ""
+ lvm_disk_count = var.lvm_disk_count != null ? var.lvm_disk_count : ""
os_type = var.os_type
+ cluster_type = var.cluster_type
hadoop_public_key = indent(6, tls_private_key.hadoop.public_key_openssh)
hadoop_private_key = indent(6, tls_private_key.hadoop.private_key_pem)
})
@@ -97,5 +118,3 @@ data "cloudinit_config" "cfg" {
output "cloud_init_data" {
value = data.cloudinit_config.cfg.rendered
}
-
-
diff --git a/contrib/terraform-testing-infrastructure/modules/cloud-init-config/templates/cloud-init.tftpl b/contrib/terraform-testing-infrastructure/modules/cloud-init-config/templates/cloud-init.tftpl
index 30cc806..0172e40 100644
--- a/contrib/terraform-testing-infrastructure/modules/cloud-init-config/templates/cloud-init.tftpl
+++ b/contrib/terraform-testing-infrastructure/modules/cloud-init-config/templates/cloud-init.tftpl
@@ -75,6 +75,9 @@ packages:
# Make directories on each node
#
runcmd:
+%{ if cluster_type == "azure" && lvm_mount_point != "" ~}
+ - /usr/local/bin/format-lvm-data-disk.sh ${lvm_disk_count} ${lvm_mount_point} hadoop.hadoop
+%{ endif ~}
- mkdir -p ${software_root} ${zookeeper_dir} ${hadoop_dir} ${accumulo_dir}
- chown hadoop.hadoop ${software_root} ${zookeeper_dir} ${hadoop_dir} ${accumulo_dir}
- systemctl enable docker
@@ -149,3 +152,9 @@ write_files:
permissions: '0755'
content: |
${indent(6, file("${files_path}/update-hosts-genders.sh"))}
+%{ if cluster_type == "azure" ~}
+ - path: /usr/local/bin/format-lvm-data-disk.sh
+ permissions: '0755'
+ content: |
+ ${indent(6, file("${files_path}/azure-format-lvm-data-disk.sh"))}
+%{ endif ~}
diff --git a/contrib/terraform-testing-infrastructure/modules/config-files/templates/zoo.cfg.tftpl b/contrib/terraform-testing-infrastructure/modules/config-files/templates/zoo.cfg.tftpl
index 3eba821..1de64d8 100644
--- a/contrib/terraform-testing-infrastructure/modules/config-files/templates/zoo.cfg.tftpl
+++ b/contrib/terraform-testing-infrastructure/modules/config-files/templates/zoo.cfg.tftpl
@@ -9,7 +9,7 @@ syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
-dataDir=${zookeeper_dir}
+dataDir=${zookeeper_dir}/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.