Posted to commits@ambari.apache.org by ol...@apache.org on 2018/05/15 01:10:19 UTC

[ambari] branch trunk updated: AMBARI-23822. Document Log Search / Atlas Solr collection migration as well

This is an automated email from the ASF dual-hosted git repository.

oleewere pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/ambari.git


The following commit(s) were added to refs/heads/trunk by this push:
     new 8dde330  AMBARI-23822. Document Log Search / Atlas Solr collection migration as well
8dde330 is described below

commit 8dde33057bb8d094523cb33fcca57ca433f44d16
Author: Oliver Szabo <ol...@gmail.com>
AuthorDate: Tue May 15 03:08:56 2018 +0200

    AMBARI-23822. Document Log Search / Atlas Solr collection migration as well
---
 ambari-infra/ambari-infra-solr-client/README.md | 450 +++++++++++++++++++-----
 1 file changed, 366 insertions(+), 84 deletions(-)

diff --git a/ambari-infra/ambari-infra-solr-client/README.md b/ambari-infra/ambari-infra-solr-client/README.md
index 1111782..d20c42b 100644
--- a/ambari-infra/ambari-infra-solr-client/README.md
+++ b/ambari-infra/ambari-infra-solr-client/README.md
@@ -21,74 +21,52 @@ limitations under the License.
 
 CLI helper tool(s) for Ambari Infra Solr.
 
-### Solr Migration Helper (Solr 5.x to 7.x)
-
-`/usr/lib/ambari-infra-solr-client/migrationHelper.py --help`
-
-```text
-Usage: migrationHelper.py [options]
-
-Options:
-  -h, --help            show this help message and exit
-  -H HOST, --host=HOST  hostname for ambari server
-  -P PORT, --port=PORT  port number for ambari server
-  -c CLUSTER, --cluster=CLUSTER
-                        name cluster
-  -s, --ssl             use if ambari server using https
-  -u USERNAME, --username=USERNAME
-                        username for accessing ambari server
-  -p PASSWORD, --password=PASSWORD
-                        password for accessing ambari server
-  -a ACTION, --action=ACTION
-                        backup | restore | migrate
-  -f, --force           force index upgrade even if it's the right version
-  --index-location=INDEX_LOCATION
-                        location of the index backups
-  --backup-name=BACKUP_NAME
-                        backup name of the index
-  --collection=COLLECTION
-                        solr collection
-  --version=INDEX_VERSION
-                        lucene index version for migration (6.6.2 or 7.3.0)
-  --request-tries=REQUEST_TRIES
-                        number of tries for BACKUP/RESTORE status api calls in
-                        the request
-  --request-time-interval=REQUEST_TIME_INTERVAL
-                        time interval between BACKUP/RESTORE status api calls
-                        in the request
-  --request-async       skip BACKUP/RESTORE status api calls from the command
-  --shared-fs           shared fs for storing backup (will create index
-                        location to <path><hostname>)
-  --solr-hosts=SOLR_HOSTS
-                        comma separated list of solr hosts
-  --disable-solr-host-check
-                        Disable to check solr hosts are good for the
-                        collection backups
-  --core-filter=CORE_FILTER
-                        core filter for replica folders
-  --skip-cores=SKIP_CORES
-                        specific cores to skip (comma separated)
-  --shards=SOLR_SHARDS  number of shards (required to set properly for
-                        restore)
-  --solr-hdfs-path=SOLR_HDFS_PATH
-                        Base path of Solr (where collections are located) if
-                        HDFS is used (like /user/infra-solr)
-  --solr-keep-backup    If it is turned on, Snapshot Solr data will not be
-                        deleted from the filesystem during restore.
-```
-
-#### I. Backup/Migrate/Restore Ranger collection (Ambari 2.6.x to Ambari 2.7.x)
-
-Before you start to upgrade process, check how many shards you have for Ranger collection, in order to know later how many shards you need to create for the collection where you will store the migrated index. Also make sure you have stable shards (at least one core is up and running)
-
-##### 1. Upgrade Ambari Infra Solr Client
+### Post Ambari Server Upgrade (Ambari 2.7.x)
+
+Ambari Infra Solr uses Solr 7 from Ambari 2.7.0, therefore it is required to migrate the Solr 5 index (Ambari Infra 2.6.x) if you want to keep your old data. (Otherwise the backup part can be skipped.)
+
+#### Contents:
+- [I. Upgrade Ambari Infra Solr Client](#i.-upgrade-ambari-infra-solr-client)
+- [II. Backup Solr Collections](#ii.-backup-collections-(ambari-2.6.x-to-ambari-2.7.x))
+    - a.) If you have Ranger Ambari service with Solr audits:
+        - [1. Backup Ranger collection](#ii/1.-backup-ranger-collection)
+        - [2. Backup Ranger configs on Solr ZNode](#ii/2.-backup-ranger-configs-on-solr-znode)
+        - [3. Delete Ranger collection](#ii/3.-delete-ranger-collection)
+        - [4. Upgrade Ranger Solr schema](#ii/4.-upgrade-ranger-solr-schema)
+    - b.) If you have Atlas Ambari service:
+        - [5. Backup Atlas collections](#ii/5.-backup-atlas-collections)
+        - [6. Delete Atlas collections](#ii/6.-delete-atlas-collections)
+    - c.) If you have Log Search Ambari service:
+        - [7. Delete Log Search collections](#ii/7.-delete-log-search-collections)
+        - [8. Delete Log Search Solr configs](#ii/8.-delete-log-search-solr-configs)
+- [III. Upgrade Ambari Infra Solr package](#iii.-upgrade-infra-solr-packages)
+- [IV. Re-create Solr Collections](#iv.-re-create-collections)
+- [V. Migrate Solr Collections](#v.-migrate-solr-collections)
+    - a.) If you have Ranger Ambari service with Solr audits:
+        - [1. Migrate Ranger Solr collection](#v/1.-migrate-ranger-collections)
+    - b.) If you have Atlas Ambari service:
+        - [2. Migrate Atlas Solr collections](#v/2.-migrate-atlas-collections)
+- [VI. Restore Solr Collections](#vi.-restore-collections)
+    - a.) If you have Ranger Ambari service with Solr audits:
+        - [1. Restore old Ranger collection](#vi/1.-restore-old-ranger-collection)
+        - [2. Reload restored Ranger collection](#vi/2.-reload-restored-collection)
+        - [3. Transport old data to Ranger collection](#vi/3.-transport-old-data-to-ranger-collection)
+    - b.) If you have Atlas Ambari service:
+        - [4. Restore old Atlas collections](#vi/4.-restore-old-atlas-collections)
+        - [5. Reload restored Atlas collections](#vi/5.-reload-restored-atlas-collections)
+        - [6. Transport old data to Atlas collections](#vi/6.-transport-old-data-to-atlas-collections)
+
+### I. Upgrade Ambari Infra Solr Client
 
 First make sure `ambari-infra-solr-client` is up to date (upgrade it if it's older than 2.7.x). It will contain the migrationHelper.py script at the `/usr/lib/ambari-infra-solr-client` location.
 Also make sure you do not upgrade `ambari-infra-solr` until the migration is done. (All of this should happen after the `ambari-server` upgrade; also make sure not to restart the `INFRA_SOLR` instances.)
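+
+For example, on CentOS (assuming the client was installed from the Ambari repository):
+
+```bash
+yum upgrade -y ambari-infra-solr-client
+```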
 
-##### 2. Backup Ranger collection
+### II. Backup collections (Ambari 2.6.x to Ambari 2.7.x)
 
-Use `/usr/lib/ambari-infra-solr-client/migrationHelper.py` script to backup the ranger collection.
+Before you start the upgrade process, check how many shards you have for the Ranger collection (help: [get core names](#get-core-/-shard-names-with-hosts)), so that you know how many shards to create later for the collection where you will store the migrated index. Also make sure you have stable shards (at least one core is up and running) and that you will have enough disk space to store the Solr backup data.
+
+#### II/1. Backup Ranger collection
+
+Use the [migrationHelper.py](#solr-migration-helper-script) script to back up the Ranger collection.
 
 ```bash
 # collection parameters
@@ -129,7 +107,9 @@ mkdir -p $BACKUP_PATH
 curl --negotiate -k -u : "$SOLR_URL/$BACKUP_CORE/replication?command=BACKUP&location=$BACKUP_PATH&name=$BACKUP_CORE_NAME"
 ```
 
-##### 3. Backuo Ranger configs on Solr ZNode
+(help: [get core names](#get-core-/-shard-names-with-hosts))
+
+#### II/2. Backup Ranger configs on Solr ZNode
 
 Next you can copy `ranger_audits` configs to a different znode, in order to keep the old schema.
 
@@ -141,7 +121,7 @@ export ZK_CONN_STR=... # without znode, e.g.: myhost1:2181,myhost2:2181,myhost3:
 infra-solr-cloud-cli --transfer-znode -z $ZK_CONN_STR --jaas-file /etc/ambari-infra-solr/conf/infra_solr_jaas.conf --copy-src /infra-solr/configs/ranger_audits --copy-dest /infra-solr/configs/old_ranger_audits
 ```
 
-##### 4. Delete Ranger Collection
+#### II/3. Delete Ranger collection
 
 At this point you can delete the actual Ranger collection with this command:
 
@@ -156,7 +136,7 @@ kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hos
 curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=DELETE&name=$COLLECTION_NAME" 
 ```
 
-##### 5. Upgrade Ranger Solr schema
+#### II/4. Upgrade Ranger Solr schema
 
 Before creating the new Ranger collection, it is required to upgrade `managed-schema` configs.
 
@@ -182,24 +162,125 @@ wget -O managed-schema https://raw.githubusercontent.com/apache/ranger/master/se
 # Upload the new schema
 /usr/lib/ambari-infra-solr/server/scripts/cloud-scripts/zkcli.sh --zkhost "${ZK_HOST}" -cmd putfile /configs/ranger_audits/managed-schema managed-schema
 ```
-##### 6. Upgrade Infra Solr Packages 
 
-At this step, you will need to upgrade ambari-infra-solr packages as well, but just after that you finished the backup and config upgrades for other collections as well (not just RANGER, do it for ATLAS and LOGSEARCH as well).
-So you will need to stop here, and only continue if you are ready with the backup + delete collection part with all of the collections.
+#### II/5. Backup Atlas collections
+
+Atlas has 3 collections: fulltext_index, edge_index, vertex_index.
+You will need to do similar steps to those you did for Ranger, but it is required for all 3 collections. (The steps below are for fulltext_index.)
 
-Example (for CentOS)
+```bash
+# collection parameters
+BACKUP_COLLECTION=fulltext_index
+BACKUP_NAME=fulltext_index
+# init ambari parameters
+AMBARI_SERVER_HOST=... # e.g.: c7401.ambari.apache.org
+AMBARI_SERVER_PORT=... # e.g.: 8080
+CLUSTER_NAME=... # e.g.: cl1
+AMBARI_USERNAME=... # e.g.: admin
+AMBARI_PASSWORD=... # e.g.: admin
+
+BACKUP_PATH=... # set a backup location like /tmp/fulltext_index_backup; the command will create that folder if it does not exist
+
+# use -s or --ssl option if ssl enabled for ambari-server
+
+/usr/lib/ambari-infra-solr-client/migrationHelper.py -H $AMBARI_SERVER_HOST -P $AMBARI_SERVER_PORT -c $CLUSTER_NAME -u $AMBARI_USERNAME -p $AMBARI_PASSWORD --action backup --index-location $BACKUP_PATH --collection $BACKUP_COLLECTION --backup-name $BACKUP_NAME
+```
+
+You can also do the backup manually on every Solr node by using the [backup API of Solr](https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html). (Run it against core names, not the collection name; it works as expected only if you have 1 shard on every node.)
+
+Example:
+```bash
+
+su infra-solr
+SOLR_URL=... # actual solr host url, example: http://c6401.ambari.apache.org:8886/solr 
+# collection parameters
+BACKUP_PATH=... # backup location, e.g.: /tmp/fulltext_index_backup
+
+# RUN THIS FOR EVERY CORE ON SPECIFIC HOSTS !!!
+BACKUP_CORE=... # specific core on a host
+BACKUP_CORE_NAME=... # name for the backup snapshot -> will be created as <backup_location>/snapshot.$BACKUP_CORE_NAME
+kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hostname -f)
+mkdir -p $BACKUP_PATH
+
+curl --negotiate -k -u : "$SOLR_URL/$BACKUP_CORE/replication?command=BACKUP&location=$BACKUP_PATH&name=$BACKUP_CORE_NAME"
+```
+(help: [get core names](#get-core-/-shard-names-with-hosts))
+
+#### II/6. Delete Atlas collections
+
+The next step for Atlas is to delete all 3 old collections.
+
+```bash
+su infra-solr # infra-solr user - if you have a custom one, use that
+SOLR_URL=... # example: http://c6401.ambari.apache.org:8886/solr
+
+# use kinit and --negotiate option for curl only if the cluster is kerberized
+kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hostname -f)
+
+COLLECTION_NAME=fulltext_index
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=DELETE&name=$COLLECTION_NAME" 
+COLLECTION_NAME=edge_index
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=DELETE&name=$COLLECTION_NAME" 
+COLLECTION_NAME=vertex_index
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=DELETE&name=$COLLECTION_NAME" 
+```
+
+#### II/7. Delete Log Search collections
+
+For Log Search, deleting the old collections is mandatory.
+
+```bash
+su infra-solr # infra-solr user - if you have a custom one, use that
+SOLR_URL=... # example: http://c6401.ambari.apache.org:8886/solr
+
+# use kinit and --negotiate option for curl only if the cluster is kerberized
+kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hostname -f)
+
+COLLECTION_NAME=hadoop_logs
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=DELETE&name=$COLLECTION_NAME" 
+COLLECTION_NAME=audit_logs
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=DELETE&name=$COLLECTION_NAME" 
+COLLECTION_NAME=history
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=DELETE&name=$COLLECTION_NAME" 
+```
+
+#### II/8. Delete Log Search Solr configs
+
+Log Search configs changed a lot between Ambari 2.6.x and Ambari 2.7.x, so it is required to delete them as well. (The configs will be regenerated during Log Search startup.)
+
+```bash
+su infra-solr # infra-solr user - if you have a custom one, use that
+# ZOOKEEPER CONNECTION STRING from zookeeper servers
+export ZK_CONN_STR=... # without znode, e.g.: myhost1:2181,myhost2:2181,myhost3:2181 
+
+kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hostname -f)
+
+zookeeper-client -server $ZK_CONN_STR rmr /infra-solr/configs/hadoop_logs
+zookeeper-client -server $ZK_CONN_STR rmr /infra-solr/configs/audit_logs
+zookeeper-client -server $ZK_CONN_STR rmr /infra-solr/configs/history
+```
+
+### III. Upgrade Infra Solr packages
+
+At this step, you will need to upgrade the `ambari-infra-solr` packages. (Also make sure the ambari-logsearch* packages are upgraded as well.)
+
+Example (for CentOS):
 ```bash
 yum upgrade -y ambari-infra-solr
 ```
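+
+On Debian/Ubuntu-based systems the equivalent would be something like this (assuming Ambari's deb repositories are configured):
+
+```bash
+apt-get update && apt-get install -y ambari-infra-solr
+```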
 
-##### 7. Re-create Ranger collections
+### IV. Re-create collections
+
+Restart the Ranger Admin / Atlas / Log Search Ambari services. As the collections were deleted before, new collections will be created during startup (as Solr 7 collections).
+At this point you can stop and do the migration / restore later (as long as you still have the backup), and go ahead with e.g. the HDP upgrade. (The migration part can take long: about 1GB/min.)
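+
+If you prefer to script the restarts instead of using the Ambari UI, a minimal sketch with the Ambari REST API could look like this (host, credentials and the service name are placeholders):
+
+```bash
+AMBARI_URL=... # e.g.: http://c7401.ambari.apache.org:8080
+CLUSTER_NAME=... # e.g.: cl1
+SERVICE=RANGER # repeat for ATLAS and LOGSEARCH if they are installed
+
+# stop the service
+curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo":{"context":"Stop service"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' "$AMBARI_URL/api/v1/clusters/$CLUSTER_NAME/services/$SERVICE"
+# start it again - the new (Solr 7) collections are created during startup
+curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo":{"context":"Start service"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' "$AMBARI_URL/api/v1/clusters/$CLUSTER_NAME/services/$SERVICE"
+```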
 
-Just restart Ranger Admin service, as the collection was deleted before, during startup, the new Ranger Solr collection will be created (as a Solr 7 collection)
+### V. Migrate Solr Collections
 
+From this point, you can migrate your old index in the background. On every host where there is a backup located, you can run the Lucene index migration tool (packaged with ambari-infra-solr-client). For Lucene index migration, [migrationHelper.py](#solr-migration-helper-script) can be used, or `/usr/lib/ambari-infra-solr-client/solrIndexHelper.sh` directly. That script uses the [IndexUpgrader tool](https://lucene.apache.org/solr/guide/7_3/indexupgrader-tool.html).
 
-##### 8. Migrate Ranger index
+#### V/1. Migrate Ranger collections
 
-From this point, you can migrate your old index in the background. On every hosts, where there is a backup located, you can run luce index migration tool (packaged with ambari-infra-solr-client).
+Migration for `ranger_audits` collection (cores):
 
 ```bash
 # init ambari parameters
@@ -229,9 +310,46 @@ infra-lucene-index-tool upgrade-index -d /tmp/ranger-backup -f -b -g
 
 By default, the tool will migrate from Lucene version 5 to Lucene version 6.6.0 (that's OK for Solr 7). If you want a Lucene 7 index, you will need to re-run the migration tool command with the `-v 7.3.0` option.
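+
+For example, a re-run producing a Lucene 7 index (reusing the Ranger backup path from the step above):
+
+```bash
+export JAVA_HOME=/usr/jdk64/1.8.0_112
+infra-lucene-index-tool upgrade-index -d /tmp/ranger-backup -f -b -g -v 7.3.0
+```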
 
-##### 9. Restore Old Ranger Collection
+#### V/2. Migrate Atlas collections
+
+As Atlas has 3 collections, you will need similar steps to those required for Ranger, just for all 3 collections
+(fulltext_index, edge_index, vertex_index).
+
+Example with fulltext_index:
+
+```bash
+# init ambari parameters
+AMBARI_SERVER_HOST=...
+AMBARI_SERVER_PORT=...
+CLUSTER_NAME=...
+AMBARI_USERNAME=...
+AMBARI_PASSWORD=...
+
+BACKUP_PATH=... # will run migration on every folder which contains *snapshot* in its name
+BACKUP_COLLECTION=fulltext_index # collection name - used only for logging
 
-After you finished your lucene data migration, you can restore your replicas on every hosts where you have the backups. But we need to restore the old data to a new collection, so first you will need to create that: (on a host where you have an installed Infra Solr component). For Ranger, use old_ranger_audits config set that you backup up during Solr schema config upgrade step. (set this as CONFIG_NAME), to make that collection to work with Solr 7, you need to copy your solrconfig.xml as well.
+# use -s or --ssl option if ssl enabled for ambari-server
+/usr/lib/ambari-infra-solr-client/migrationHelper.py -H $AMBARI_SERVER_HOST -P $AMBARI_SERVER_PORT -c $CLUSTER_NAME -u $AMBARI_USERNAME -p $AMBARI_PASSWORD --action migrate --index-location $BACKUP_PATH --collection $BACKUP_COLLECTION
+```
+
+Or you can run commands manually on nodes where your backups are located:
+```bash
+
+export JAVA_HOME=/usr/jdk64/1.8.0_112
+
+# if /tmp/fulltext_index_backup is your backup location
+infra-lucene-index-tool upgrade-index -d /tmp/fulltext_index_backup -f -b -g
+
+# with the 'infra-lucene-index-tool help' command you can check out the command line options
+```
+
+By default, the tool will migrate from Lucene version 5 to Lucene version 6.6.0 (that's OK for Solr 7). If you want a Lucene 7 index, you will need to re-run the migration tool command with the `-v 7.3.0` option.
+
+### VI. Restore Collections
+
+#### VI/1. Restore Old Ranger collection
+
+After the Lucene data migration is finished, you can restore your replicas on every host where you have the backups. But we need to restore the old data to a new collection, so first you will need to create it (on a host where an Infra Solr component is installed). For Ranger, use the old_ranger_audits config set that you backed up during the Solr schema config upgrade step (set this as CONFIG_NAME); to make that collection work with Solr 7, you need to copy your solrconfig.xml as well.
 
 Create a collection for restoring the backup (`old_ranger_audits`)
 ```bash
@@ -241,7 +359,7 @@ NUM_SHARDS=... # use that number that was used for the old collection - importan
 NUM_REP=1 # can be more, but 1 is recommended for that temp collection
 MAX_SHARDS_PER_NODE=... # use that number that was used for the old collection
 CONFIG_NAME=old_ranger_audits
-OLD_DATA_COLLECTION=old_ranger_audit
+OLD_DATA_COLLECTION=old_ranger_audits
 
 # kinit only if kerberos is enabled for the cluster
 kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hostname -f)
@@ -257,6 +375,7 @@ curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=CREATE&name=$OLD_DA
 ```
 
 Restore the collection:
+(Important note: you will need to add the `--solr-hdfs-path` option if your index is on HDFS (value can be like `/user/infra-solr`), pointing to the base path where your collections are located.)
 ```bash
 # init ambari parameters
 AMBARI_SERVER_HOST=...
@@ -274,9 +393,7 @@ NUM_SHARDS=... # important, use a proper number, that will be stored in core.pro
 /usr/lib/ambari-infra-solr-client/migrationHelper.py -H $AMBARI_SERVER_HOST -P $AMBARI_SERVER_PORT -c $CLUSTER_NAME -u $AMBARI_USERNAME -p $AMBARI_PASSWORD --action restore --index-location $BACKUP_PATH --collection $OLD_BACKUP_COLLECTION --backup-name $BACKUP_NAME --shards $NUM_SHARDS
 ```
 
-You will need to add `--solr-hdfs-path` option if your index is on HDFS (value can be like: `/user/infra-solr`), which should be the location where your collections are located.
-
-Also you can manually run restore commands:
+You can also run the restore commands manually: ([get core names](#get-core-/-shard-names-with-hosts))
 
 ```bash
 su infra-solr
@@ -292,7 +409,7 @@ curl --negotiate -k -u : "$SOLR_URL/$OLD_BACKUP_COLLECTION_CORE/replication?comm
 
 Or use simple `cp` or `hdfs dfs -put` commands to copy the migrated cores to the right places.
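+
+A minimal sketch of that manual copy is below; every path and core name here is hypothetical, so check your core's actual index directory (and the HDFS layout, if used) before copying:
+
+```bash
+# assumption: default Infra Solr data dir and example core/snapshot names - verify both on your system
+cp -r /tmp/ranger-backup/snapshot.ranger_audits_shard1_replica1/* \
+  /var/lib/ambari-infra-solr/data/old_ranger_audits_shard1_replica1/data/index/
+
+# or, if the collection is stored on HDFS:
+hdfs dfs -put /tmp/ranger-backup/snapshot.ranger_audits_shard1_replica1/* \
+  /user/infra-solr/old_ranger_audits/core_node1/data/index/
+```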
 
-##### 10. Reload restored collection
+#### VI/2. Reload restored collection
 
 After the cores are restored you will need to reload the old_ranger_audits collection:
 
@@ -306,9 +423,9 @@ kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hos
 curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=RELOAD&name=$OLD_RANGER_COLLECTION"
 ```
 
-##### 11. Transport old data to ranger_audits collection
+#### VI/3. Transport old data to Ranger collection
 
-In the end, you end up with 2 collections (ranger_audits and old_ranger_audits), in order to drop the restored one, you will need to transfer your old data to the new collection. To achieve this, you can use `solrDataManager.py`, which is located next to the `migrationHelper.py` script
+In the end, you end up with 2 collections (ranger_audits and old_ranger_audits); in order to drop the restored one, you will need to transfer your old data to the new collection. To achieve this, you can use [solrDataManager.py](#solr-data-manager-script), which is located next to the `migrationHelper.py` script.
 
 ```bash
 # Init values:
@@ -329,7 +446,172 @@ infra-solr-data-manager -m archive -v -c $OLD_COLLECTION -s $SOLR_URL -z none -r
 nohup infra-solr-data-manager -m archive -v -c $OLD_COLLECTION -s $SOLR_URL -z none -r 10000 -w 100000 -f $DATE_FIELD -e $END_DATE --solr-output-collection $ACTIVE_COLLECTION -k $INFRA_SOLR_KEYTAB -n $INFRA_SOLR_PRINCIPAL --exclude-fields $EXCLUDE_FIELDS > /tmp/solr-data-mgr.log 2>&1 & echo $! > /tmp/solr-data-mgr.pid
 ```
 
-### Solr Data Manager
+#### VI/4. Restore Old Atlas collections
+
+For Atlas, use the `old_` prefix for all 3 collections that you need to create, and use the `atlas_configs` config set.
+
+Create the collections for restoring the backups (`old_fulltext_index`, `old_edge_index`, `old_vertex_index`):
+```bash
+su infra-solr # infra-solr user - if you have a custom one, use that
+SOLR_URL=... # example: http://c6401.ambari.apache.org:8886/solr
+NUM_SHARDS=... # use that number that was used for the old collection - important to use at least that many that you have originally before backup
+NUM_REP=1 # can be more, but 1 is recommended for that temp collection
+MAX_SHARDS_PER_NODE=... # use that number that was used for the old collection
+CONFIG_NAME=atlas_configs
+
+# kinit only if kerberos is enabled for the cluster
+kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hostname -f)
+
+OLD_DATA_COLLECTION=old_fulltext_index
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=CREATE&name=$OLD_DATA_COLLECTION&numShards=$NUM_SHARDS&replicationFactor=$NUM_REP&maxShardsPerNode=$MAX_SHARDS_PER_NODE&collection.configName=$CONFIG_NAME"
+OLD_DATA_COLLECTION=old_edge_index
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=CREATE&name=$OLD_DATA_COLLECTION&numShards=$NUM_SHARDS&replicationFactor=$NUM_REP&maxShardsPerNode=$MAX_SHARDS_PER_NODE&collection.configName=$CONFIG_NAME"
+OLD_DATA_COLLECTION=old_vertex_index
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=CREATE&name=$OLD_DATA_COLLECTION&numShards=$NUM_SHARDS&replicationFactor=$NUM_REP&maxShardsPerNode=$MAX_SHARDS_PER_NODE&collection.configName=$CONFIG_NAME"
+```
+
+Restore the collection(s):
+(Important note: you will need to add the `--solr-hdfs-path` option if your index is on HDFS (value can be like `/user/infra-solr`), pointing to the base path where your collections are located.)
+Example with fulltext_index: (do the same for old_vertex_index and old_edge_index)
+```bash
+# init ambari parameters
+AMBARI_SERVER_HOST=...
+AMBARI_SERVER_PORT=...
+CLUSTER_NAME=...
+AMBARI_USERNAME=...
+AMBARI_PASSWORD=...
+
+OLD_BACKUP_COLLECTION=old_fulltext_index
+BACKUP_NAME=fulltext_index # or whatever backup name you set during the backup step
+BACKUP_PATH=... # backup location, e.g.: /tmp/fulltext_index-backup
+NUM_SHARDS=... # important, use a proper number, that will be stored in core.properties files
+
+# use -s or --ssl option if ssl enabled for ambari-server
+/usr/lib/ambari-infra-solr-client/migrationHelper.py -H $AMBARI_SERVER_HOST -P $AMBARI_SERVER_PORT -c $CLUSTER_NAME -u $AMBARI_USERNAME -p $AMBARI_PASSWORD --action restore --index-location $BACKUP_PATH --collection $OLD_BACKUP_COLLECTION --backup-name $BACKUP_NAME --shards $NUM_SHARDS
+```
+
+You can also run the restore commands manually: ([get core names](#get-core-/-shard-names-with-hosts))
+
+```bash
+su infra-solr
+SOLR_URL=... # actual solr host url, example: http://c6401.ambari.apache.org:8886/solr 
+BACKUP_PATH=... # backup location, e.g.: /tmp/fulltext_index-backup
+
+OLD_BACKUP_COLLECTION_CORE=... # choose a core to restore
+BACKUP_CORE_NAME=... # choose a core from the backup cores - you can find these names as: <backup_location>/snapshot.$BACKUP_CORE_NAME
+
+kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hostname -f)
+curl --negotiate -k -u : "$SOLR_URL/$OLD_BACKUP_COLLECTION_CORE/replication?command=RESTORE&location=$BACKUP_PATH&name=$BACKUP_CORE_NAME"
+```
+
+Or use simple `cp` or `hdfs dfs -put` commands to copy the migrated cores to the right places (see the sketch in the Ranger restore step).
+
+#### VI/5. Reload restored Atlas collections
+
+After the cores are restored you will need to reload all 3 Atlas collections:
+
+```bash
+su infra-solr
+SOLR_URL=... # actual solr host url, example: http://c6401.ambari.apache.org:8886/solr 
+
+# use kinit only if kerberos is enabled
+kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab $(whoami)/$(hostname -f)
+
+OLD_BACKUP_COLLECTION=old_fulltext_index
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=RELOAD&name=$OLD_BACKUP_COLLECTION"
+OLD_BACKUP_COLLECTION=old_edge_index
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=RELOAD&name=$OLD_BACKUP_COLLECTION"
+OLD_BACKUP_COLLECTION=old_vertex_index
+curl --negotiate -k -u : "$SOLR_URL/admin/collections?action=RELOAD&name=$OLD_BACKUP_COLLECTION"
+```
+
+#### VI/6. Transport old data to Atlas collections
+
+In the end, you end up with 6 Atlas collections (vertex_index, old_vertex_index, edge_index, old_edge_index, fulltext_index, old_fulltext_index); in order to drop the restored ones, you will need to transfer your old data to the new collections. To achieve this, you can use [solrDataManager.py](#solr-data-manager-script), which is located next to the `migrationHelper.py` script.
+
+Example: (with fulltext_index, do the same with edge_index and vertex_index)
+```bash
+# Init values:
+SOLR_URL=... # example: http://c6401.ambari.apache.org:8886/solr
+INFRA_SOLR_KEYTAB=... # example: /etc/security/keytabs/ambari-infra-solr.service.keytab
+INFRA_SOLR_PRINCIPAL=... # example: infra-solr/$(hostname -f)@EXAMPLE.COM
+END_DATE=... # example: 2018-02-18T12:00:00.000Z , date until you export data
+
+OLD_COLLECTION=old_fulltext_index
+ACTIVE_COLLECTION=fulltext_index
+EXCLUDE_FIELDS=_version_ # comma separated exclude fields, at least _version_ is required
+
+DATE_FIELD=timestamp
+# infra-solr-data-manager is a symlink points to /usr/lib/ambari-infra-solr-client/solrDataManager.py
+infra-solr-data-manager -m archive -v -c $OLD_COLLECTION -s $SOLR_URL -z none -r 10000 -w 100000 -f $DATE_FIELD -e $END_DATE --solr-output-collection $ACTIVE_COLLECTION -k $INFRA_SOLR_KEYTAB -n $INFRA_SOLR_PRINCIPAL --exclude-fields $EXCLUDE_FIELDS
+
+# Or if you want to run the command in the background (with log and pid file):
+nohup infra-solr-data-manager -m archive -v -c $OLD_COLLECTION -s $SOLR_URL -z none -r 10000 -w 100000 -f $DATE_FIELD -e $END_DATE --solr-output-collection $ACTIVE_COLLECTION -k $INFRA_SOLR_KEYTAB -n $INFRA_SOLR_PRINCIPAL --exclude-fields $EXCLUDE_FIELDS > /tmp/solr-data-mgr.log 2>&1 & echo $! > /tmp/solr-data-mgr.pid
+```
+
+### APPENDIX
+
+#### Get core / shard names with hosts
+
+To find out which hosts are related to your collections, you can check the Solr UI (using SPNEGO), or fetch the state.json details of the collection using a zookeeper-client or the Solr ZooKeeper API (`/solr/admin/zookeeper?detail=true&path=/collections/<collection_name>/state.json`).
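+
+For example (a sketch reusing variable names from the steps above; `ranger_audits` is just a sample collection name):
+
+```bash
+# via the Solr ZooKeeper API (kinit first on kerberized clusters):
+curl --negotiate -k -u : "$SOLR_URL/admin/zookeeper?detail=true&path=/collections/ranger_audits/state.json"
+
+# or via zookeeper-client, assuming the /infra-solr znode used throughout this guide:
+zookeeper-client -server $ZK_CONN_STR get /infra-solr/collections/ranger_audits/state.json
+```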
+
+#### Solr Migration Helper Script
+
+`/usr/lib/ambari-infra-solr-client/migrationHelper.py --help`
+
+```text
+Usage: migrationHelper.py [options]
+
+Options:
+  -h, --help            show this help message and exit
+  -H HOST, --host=HOST  hostname for ambari server
+  -P PORT, --port=PORT  port number for ambari server
+  -c CLUSTER, --cluster=CLUSTER
+                        name cluster
+  -s, --ssl             use if ambari server using https
+  -u USERNAME, --username=USERNAME
+                        username for accessing ambari server
+  -p PASSWORD, --password=PASSWORD
+                        password for accessing ambari server
+  -a ACTION, --action=ACTION
+                        backup | restore | migrate
+  -f, --force           force index upgrade even if it's the right version
+  --index-location=INDEX_LOCATION
+                        location of the index backups
+  --backup-name=BACKUP_NAME
+                        backup name of the index
+  --collection=COLLECTION
+                        solr collection
+  --version=INDEX_VERSION
+                        lucene index version for migration (6.6.2 or 7.3.0)
+  --request-tries=REQUEST_TRIES
+                        number of tries for BACKUP/RESTORE status api calls in
+                        the request
+  --request-time-interval=REQUEST_TIME_INTERVAL
+                        time interval between BACKUP/RESTORE status api calls
+                        in the request
+  --request-async       skip BACKUP/RESTORE status api calls from the command
+  --shared-fs           shared fs for storing backup (will create index
+                        location to <path><hostname>)
+  --solr-hosts=SOLR_HOSTS
+                        comma separated list of solr hosts
+  --disable-solr-host-check
+                        Disable to check solr hosts are good for the
+                        collection backups
+  --core-filter=CORE_FILTER
+                        core filter for replica folders
+  --skip-cores=SKIP_CORES
+                        specific cores to skip (comma separated)
+  --shards=SOLR_SHARDS  number of shards (required to set properly for
+                        restore)
+  --solr-hdfs-path=SOLR_HDFS_PATH
+                        Base path of Solr (where collections are located) if
+                        HDFS is used (like /user/infra-solr)
+  --solr-keep-backup    If it is turned on, Snapshot Solr data will not be
+                        deleted from the filesystem during restore.
+```
+
+#### Solr Data Manager Script
 
 `/usr/lib/ambari-infra-solr-client/solrDataManager.py --help`
 
