Posted to dev@storm.apache.org by knusbaum <gi...@git.apache.org> on 2017/09/26 20:27:19 UTC
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
GitHub user knusbaum opened a pull request:
https://github.com/apache/storm/pull/2347
STORM-2760: Add Blobstore Migration Scripts
Add code and helper scripts for migrating active Storm clusters from a locally-backed BlobStore to an HDFS-backed BlobStore.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/knusbaum/incubator-storm blobstore-migrator
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/2347.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2347
----
commit 4af7da0ff48bd347aa49509744c0e6a8e749341f
Author: Kyle Nusbaum <kn...@yahoo-inc.com>
Date: 2017-09-26T20:17:32Z
Adding storm blobstore migrator.
----
---
[GitHub] storm issue #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on the issue:
https://github.com/apache/storm/pull/2347
The integration test failure looks unrelated.
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141183242
--- Diff: external/storm-blobstore-migration/pom.xml ---
@@ -0,0 +1,126 @@
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+
+ <parent>
+ <artifactId>storm</artifactId>
+ <groupId>org.apache.storm</groupId>
+ <version>2.0.0-SNAPSHOT</version>
+ <relativePath>../../pom.xml</relativePath>
+ </parent>
+
+ <artifactId>blobstore-migrator</artifactId>
+ <packaging>jar</packaging>
+
+ <name>blobstore-migrator</name>
+ <url>http://maven.apache.org</url>
+ <dependencies>
+ <dependency>
+ <groupId>org.apache.storm</groupId>
+ <artifactId>storm-server</artifactId>
+ <version>${project.version}</version>
+ <exclusions>
+ <!--log4j-over-slf4j must be excluded for hadoop-minicluster
+ see: http://stackoverflow.com/q/20469026/3542091 -->
+ <exclusion>
+ <groupId>org.slf4j</groupId>
+ <artifactId>log4j-over-slf4j</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.storm</groupId>
+ <artifactId>storm-hdfs</artifactId>
+ <version>${project.version}</version>
+ <exclusions>
+ <!--log4j-over-slf4j must be excluded for hadoop-minicluster
+ see: http://stackoverflow.com/q/20469026/3542091 -->
+ <exclusion>
+ <groupId>org.slf4j</groupId>
+ <artifactId>log4j-over-slf4j</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+ <!-- <dependency> -->
+ <!-- <artifactId>storm-core</artifactId> -->
+ <!-- <groupId>org.apache.storm</groupId> -->
+ <!-- <version>0.10.2.y</version> -->
+ <!-- </dependency> -->
+ <dependency>
+ <artifactId>hadoop-hdfs</artifactId>
+ <groupId>org.apache.hadoop</groupId>
+ <version>${hdfs.version}</version>
+ </dependency>
+ <dependency>
+ <artifactId>hadoop-client</artifactId>
+ <groupId>org.apache.hadoop</groupId>
+ <version>${hadoop.version}</version>
+ </dependency>
+ <dependency>
+ <artifactId>hadoop-common</artifactId>
+ <groupId>org.apache.hadoop</groupId>
+ <version>${hadoop.version}</version>
+ </dependency>
+ <dependency>
+ <groupId>yahoo.yinst.storm_hadoop_client_conf</groupId>
+ <artifactId>storm_hadoop_client_conf</artifactId>
+ <version>1.0.0.4</version>
+ </dependency>
+ </dependencies>
+ <repositories>
+ <repository>
+ <id>central-ymaven</id>
+ <url>http://ymaven.corp.yahoo.com:9999/proximity/repository/public</url>
+ <snapshots>
+ <enabled>true</enabled>
+ </snapshots>
+ <releases>
+ <enabled>true</enabled>
+ </releases>
+ </repository>
+ </repositories>
+ <build>
+ <plugins>
+ <plugin>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <version>3.1</version>
+ <configuration>
+ <source>1.8</source>
+ <target>1.8</target>
+ </configuration>
+ </plugin>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-jar-plugin</artifactId>
+ <configuration>
+ <archive>
+ <manifest>
+ <mainClass>org.apache.storm.blobstore.MigratorMain</mainClass>
+ </manifest>
+ </archive>
+ </configuration>
+ </plugin>
--- End diff --
Indentation looks really off here.
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141183809
--- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/ListHDFS.java ---
@@ -0,0 +1,52 @@
+package org.apache.storm.blobstore;
+
+import java.util.Map;
+
+import javax.security.auth.Subject;
+
+import org.apache.storm.Config;
+import org.apache.storm.blobstore.ClientBlobStore;
+import org.apache.storm.hdfs.blobstore.HdfsBlobStore;
+import org.apache.storm.hdfs.blobstore.HdfsClientBlobStore;
+import org.apache.storm.utils.Utils;
+
+public class ListHDFS {
+
+ public static void main(String[] args) throws Exception {
--- End diff --
Could you file a follow-on JIRA to make this part of the storm admin command?
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141183361
--- Diff: external/storm-blobstore-migration/migrate.sh ---
@@ -0,0 +1,13 @@
+#!/usr/bin/env bash
--- End diff --
needs a license header
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141184296
--- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/ListHDFS.java ---
@@ -0,0 +1,52 @@
+package org.apache.storm.blobstore;
+
+import java.util.Map;
+
+import javax.security.auth.Subject;
+
+import org.apache.storm.Config;
+import org.apache.storm.blobstore.ClientBlobStore;
+import org.apache.storm.hdfs.blobstore.HdfsBlobStore;
+import org.apache.storm.hdfs.blobstore.HdfsClientBlobStore;
+import org.apache.storm.utils.Utils;
+
+public class ListHDFS {
+
+ public static void main(String[] args) throws Exception {
+ if(args.length < 1) {
+ System.out.println("Need at least 1 argument (hdfs_blobstore_path), but have " + Integer.toString(args.length));
+ System.out.println("listHDFS <hdfs_blobstore_path> <hdfs_principal> <keytab>");
+ System.out.println("Lists blobs in HdfsBlobStore");
+ System.out.println("Example: listHDFS 'hdfs://some-hdfs-namenode:8080/srv/storm/my-storm-blobstore' 'stormUser/my-nimbus-host.example.com@STORM.EXAMPLE.COM' '/srv/my-keytab/stormUser.kt'");
+ System.exit(1);
+ }
+
+ Map<String, Object> hdfsConf = Utils.readStormConfig();
+ String hdfsBlobstorePath = args[0];
+
+ hdfsConf.put(Config.BLOBSTORE_DIR, hdfsBlobstorePath);
+ hdfsConf.put(Config.STORM_PRINCIPAL_TO_LOCAL_PLUGIN, "org.apache.storm.security.auth.DefaultPrincipalToLocal");
+ if(args.length >= 2) {
+ System.out.println("SETTING HDFS PRINCIPAL!");
+ hdfsConf.put(Config.BLOBSTORE_HDFS_PRINCIPAL, args[1]);
+ }
+ if(args.length >= 3) {
+ System.out.println("SETTING HDFS KEYTAB!");
+ hdfsConf.put(Config.BLOBSTORE_HDFS_KEYTAB, args[2]);
+ }
+
+ /* CREATE THE BLOBSTORES */
+ System.out.println("Creating HDFS blobstore.");
+ HdfsBlobStore hdfsBlobStore = new HdfsBlobStore();
+ hdfsBlobStore.prepare(hdfsConf, null, null);
+ System.out.println("Done.");
--- End diff --
I'm not sure users will expect the third "Done." to mean that we really are done. Could you make the messages a bit more descriptive, or remove them?
This goes for listing the local blobs as well.
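One way to make the progress output unambiguous is to name the step in both the start message and the completion message. A minimal sketch, assuming a hypothetical helper class `StepLog` that is not part of this pull request; the empty actions stand in for the real blobstore calls:

```java
// Hypothetical helper, not part of the PR: pairs each step's start and
// completion message so a bare "Done." is never printed.
class StepLog {
    // Message printed before a step begins.
    static String started(String step) {
        return step + " ...";
    }

    // Message printed after a step completes; it repeats the step name.
    static String finished(String step) {
        return step + ": done.";
    }

    // Runs one step and prints both messages around it.
    static void run(String step, Runnable action) {
        System.out.println(started(step));
        action.run();
        System.out.println(finished(step));
    }

    public static void main(String[] args) {
        // Placeholder actions stand in for the real blobstore calls.
        run("Creating HDFS blobstore", () -> { });
        run("Listing HDFS blobstore keys", () -> { });
    }
}
```

With this shape, the last line a user sees names the step that finished rather than a bare "Done.".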
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141420947
--- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/ListHDFS.java ---
@@ -0,0 +1,52 @@
+package org.apache.storm.blobstore;
+
+import java.util.Map;
+
+import javax.security.auth.Subject;
+
+import org.apache.storm.Config;
+import org.apache.storm.blobstore.ClientBlobStore;
+import org.apache.storm.hdfs.blobstore.HdfsBlobStore;
+import org.apache.storm.hdfs.blobstore.HdfsClientBlobStore;
+import org.apache.storm.utils.Utils;
+
+public class ListHDFS {
+
+ public static void main(String[] args) throws Exception {
--- End diff --
[STORM-2763](https://issues.apache.org/jira/browse/STORM-2763)
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/storm/pull/2347
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141183305
--- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/ListHDFS.java ---
@@ -0,0 +1,52 @@
+package org.apache.storm.blobstore;
--- End diff --
Needs a license header.
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141183394
--- Diff: external/storm-blobstore-migration/pom.xml ---
@@ -0,0 +1,126 @@
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
--- End diff --
needs a license header (all of the files do)
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141184831
--- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/MigrateBlobs.java ---
@@ -0,0 +1,125 @@
+package org.apache.storm.blobstore;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+
+import javax.security.auth.Subject;
+import javax.security.auth.login.LoginContext;
+
+import org.apache.storm.Config;
+import org.apache.storm.blobstore.BlobStore;
+import org.apache.storm.hdfs.blobstore.HdfsBlobStore;
+import org.apache.storm.nimbus.NimbusInfo;
+import org.apache.storm.utils.Utils;
+import org.apache.storm.blobstore.LocalFsBlobStore;
+import org.apache.storm.generated.AuthorizationException;
+import org.apache.storm.generated.KeyAlreadyExistsException;
+import org.apache.storm.generated.KeyNotFoundException;
+import org.apache.storm.generated.ReadableBlobMeta;
+import org.apache.storm.generated.SettableBlobMeta;
+
+public class MigrateBlobs {
+
+ protected static void deleteAllBlobStoreKeys(BlobStore bs, Subject who) throws AuthorizationException, KeyNotFoundException {
+ Iterable<String> hdfsKeys = () -> bs.listKeys();
+ for(String key : hdfsKeys) {
+ System.out.println(key);
+ bs.deleteBlob(key, who);
+ }
+ }
+
+ protected static void copyBlobStoreKeys(BlobStore bsFrom, Subject whoFrom, BlobStore bsTo, Subject whoTo) throws AuthorizationException, KeyAlreadyExistsException, IOException, KeyNotFoundException {
+ Iterable<String> lfsKeys = () -> bsFrom.listKeys();
+ for(String key : lfsKeys) {
+ ReadableBlobMeta readable_meta = bsFrom.getBlobMeta(key, whoFrom);
+ SettableBlobMeta meta = readable_meta.get_settable();
+ InputStream in = bsFrom.getBlob(key, whoFrom);
+ System.out.println("COPYING BLOB " + key + " FROM " + bsFrom + " TO " + bsTo);
+ bsTo.createBlob(key, in, meta, whoTo);
+ System.out.println("DONE CREATING BLOB " + key);
+ }
+ }
+
+
+ public static void main(String[] args) throws Exception {
+ // TODO Auto-generated method stub
--- End diff --
Please remove this comment.
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141182693
--- Diff: external/storm-blobstore-migration/README.md ---
@@ -0,0 +1,87 @@
+# Blobstore Migrator
+
+## Basic Use
+-----
+
+### Build The Thing
+Use make to build a tarball with everything needed.
+```
+$ make
+```
+
+### Use The Thing
+Copy and extract the tarball
+```
+$ scp blobstore-migrator.tgz my-nimbus-host.example.com:~/
+$ ssh my-nimbus-host.example.com
+... On my-nimbus-host ...
+$ tar -xvzf blobstore-migrator.tgz
+```
+
+This will expand into a blobstore-migrator directory with all the scripts and the jar.
+```
+$ cd blobstore-migrator
+$ ls
+blobstore-migrator-2.0.jar listHDFS.sh listLocal.sh migrate.sh
+```
+
+To run, first create a config for the cluster.
+The config must be named 'config'
+It must contain definitions for `HDFS_BLOBSTORE_DIR`, `LOCAL_BLOBSTORE_DIR`, and `HADOOP_CLASSPATH`.
+Hadoop jars are packaged with neither storm nor this package, so they must be installed separately.
+
+Optional configs used to configure security are: `BLOBSTORE_PRINCIPAL`, `KEYTAB_FILE`, and `JAAS_CONF`
+
+Example:
+```
+$ cat config
+HDFS_BLOBSTORE_DIR='hdfs://some-hdfs-namenode:8080/srv/storm/my-storm-blobstore'
+LOCAL_BLOBSTORE_DIR='/srv/storm'
+HADOOP_CLASSPATH='/hadoop/share/hdfs/*:/hadoop/common/*'
+
+# My security configs:
+BLOBSTORE_PRINCIPAL='stormUser/my-nimbus-host.example.com@STORM.EXAMPLE.COM'
+KEYTAB_FILE='/srv/my-keytab/stormUser.kt'
+JAAS_CONF='/storm/conf/storm_jaas.conf'
+```
+
+Now you can run any of the scripts, all of which require config to exist:
+ - listHDFS.sh: lists all blobs currently in the HDFS Blobstore
+ - listLocal.sh: lists all blobs currently in the local Blobstore
+ - migrate.sh: Begins the migration process for Nimbus. (Read instructions below first)
+
+
+#### Migrating
+##### Nimbus
+To migrate blobs from nimbus, the following steps are necessary:
+
+1. Shut down Nimbus
--- End diff --
Can we make this say to shut down all Nimbus instances? Because of HA, we cannot leave any up.
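Separately, the quoted README says the config must define `HDFS_BLOBSTORE_DIR`, `LOCAL_BLOBSTORE_DIR`, and `HADOOP_CLASSPATH`. A pre-flight check for those keys could look roughly like the sketch below. The class `ConfigCheck` is hypothetical and not part of this pull request, and it assumes the config contains only simple `KEY='value'` lines rather than general shell syntax:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check, not part of the PR. It only understands
// simple KEY='value' lines, not general shell syntax.
class ConfigCheck {
    // Keys the README lists as mandatory.
    static final List<String> REQUIRED = Arrays.asList(
        "HDFS_BLOBSTORE_DIR", "LOCAL_BLOBSTORE_DIR", "HADOOP_CLASSPATH");

    // Parses KEY=value lines, skipping blanks and '#' comments and
    // stripping one pair of surrounding single quotes from the value.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            String value = line.substring(eq + 1).trim();
            if (value.length() >= 2 && value.startsWith("'") && value.endsWith("'")) {
                value = value.substring(1, value.length() - 1);
            }
            conf.put(line.substring(0, eq).trim(), value);
        }
        return conf;
    }

    // Returns the required keys that are absent or empty, in README order.
    static List<String> missing(Map<String, String> conf) {
        List<String> out = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = conf.get(key);
            if (value == null || value.isEmpty()) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "HDFS_BLOBSTORE_DIR='hdfs://some-hdfs-namenode:8080/srv/storm/my-storm-blobstore'",
            "LOCAL_BLOBSTORE_DIR='/srv/storm'",
            "# HADOOP_CLASSPATH is deliberately left out of this sample");
        System.out.println("Missing: " + missing(parse(lines)));
    }
}
```

The optional security settings (`BLOBSTORE_PRINCIPAL`, `KEYTAB_FILE`, `JAAS_CONF`) are deliberately not checked, since the README describes them as optional.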
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141185443
--- Diff: external/storm-blobstore-migration/src/test/java/org/apache/storm/blobstore/AppTest.java ---
@@ -0,0 +1,38 @@
+package org.apache.storm.blobstore;
+
+import junit.framework.Test;
+import junit.framework.TestCase;
+import junit.framework.TestSuite;
+
+/**
+ * Unit test for simple App.
+ */
+public class AppTest
--- End diff --
Can we remove this file? Having a bogus test is worse than no test at all.
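If a real test is added later, it could exercise the key-copy behavior instead. The sketch below is only an illustration under stated assumptions: plain maps stand in for the source and destination blobstores, and the class `CopyKeysCheck`, its method, and the sample key names are hypothetical rather than taken from this pull request.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not part of the PR: plain maps stand in for the
// source and destination blobstores so the copy logic can be exercised
// without HDFS or a Storm cluster.
class CopyKeysCheck {
    // Copies every key from one store to the other and returns the count.
    // An existing destination key is treated as an error.
    static int copyAll(Map<String, byte[]> from, Map<String, byte[]> to) {
        int copied = 0;
        for (Map.Entry<String, byte[]> entry : from.entrySet()) {
            if (to.containsKey(entry.getKey())) {
                throw new IllegalStateException("Key already exists: " + entry.getKey());
            }
            to.put(entry.getKey(), entry.getValue().clone());
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) {
        // Sample keys are made up for illustration.
        Map<String, byte[]> local = new TreeMap<>();
        local.put("topo-1-stormjar.jar", new byte[] {1, 2, 3});
        local.put("topo-1-stormconf.ser", new byte[] {4});
        Map<String, byte[]> hdfs = new TreeMap<>();

        int copied = copyAll(local, hdfs);
        if (copied != 2 || !hdfs.keySet().equals(local.keySet())) {
            throw new AssertionError("copy did not preserve the key set");
        }
        System.out.println("Copied " + copied + " blobs");
    }
}
```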
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141183179
--- Diff: external/storm-blobstore-migration/pom.xml ---
@@ -0,0 +1,126 @@
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+
+ <parent>
+ <artifactId>storm</artifactId>
+ <groupId>org.apache.storm</groupId>
+ <version>2.0.0-SNAPSHOT</version>
+ <relativePath>../../pom.xml</relativePath>
+ </parent>
+
+ <artifactId>blobstore-migrator</artifactId>
+ <packaging>jar</packaging>
+
+ <name>blobstore-migrator</name>
+ <url>http://maven.apache.org</url>
+ <dependencies>
+ <dependency>
+ <groupId>org.apache.storm</groupId>
+ <artifactId>storm-server</artifactId>
+ <version>${project.version}</version>
+ <exclusions>
+ <!--log4j-over-slf4j must be excluded for hadoop-minicluster
+ see: http://stackoverflow.com/q/20469026/3542091 -->
+ <exclusion>
+ <groupId>org.slf4j</groupId>
+ <artifactId>log4j-over-slf4j</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.storm</groupId>
+ <artifactId>storm-hdfs</artifactId>
+ <version>${project.version}</version>
+ <exclusions>
+ <!--log4j-over-slf4j must be excluded for hadoop-minicluster
+ see: http://stackoverflow.com/q/20469026/3542091 -->
+ <exclusion>
+ <groupId>org.slf4j</groupId>
+ <artifactId>log4j-over-slf4j</artifactId>
+ </exclusion>
+ </exclusions>
+ </dependency>
+ <!-- <dependency> -->
+ <!-- <artifactId>storm-core</artifactId> -->
+ <!-- <groupId>org.apache.storm</groupId> -->
+ <!-- <version>0.10.2.y</version> -->
+ <!-- </dependency> -->
--- End diff --
Can we remove the commented-out code?
---
[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts
Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2347#discussion_r141182276
--- Diff: external/storm-blobstore-migration/Makefile ---
@@ -0,0 +1,19 @@
+PACKAGE_NAME=blobstore-migrator.tgz
+
+VERSION=$(shell cat VERSION || mvn help:evaluate -Dexpression=project.version | grep -v '^\[')
+
+all: $(PACKAGE_NAME)
+
+$(PACKAGE_NAME) : VERSION target/blobstore-migrator-$(VERSION).jar
+ -@rm -Rf blobstore-migrator $(PACKAGE_NAME)
+ mkdir blobstore-migrator
+ cp target/blobstore-migrator-$(VERSION).jar blobstore-migrator/
+ cp listHDFS.sh listLocal.sh migrate.sh VERSION blobstore-migrator/
+ tar -cvzf $(PACKAGE_NAME) blobstore-migrator
+# rm -Rf blobstore-migrator
--- End diff --
Can we delete the comment?
---