Posted to dev@storm.apache.org by knusbaum <gi...@git.apache.org> on 2017/09/26 20:27:19 UTC

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

GitHub user knusbaum opened a pull request:

    https://github.com/apache/storm/pull/2347

    STORM-2760: Add Blobstore Migration Scripts

    Add code and helper scripts for migrating active Storm clusters from a locally-backed BlobStore to an HDFS-backed BlobStore.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/knusbaum/incubator-storm blobstore-migrator

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2347.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2347
    
----
commit 4af7da0ff48bd347aa49509744c0e6a8e749341f
Author: Kyle Nusbaum <kn...@yahoo-inc.com>
Date:   2017-09-26T20:17:32Z

    Adding storm blobstore migrator.

----


---

[GitHub] storm issue #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on the issue:

    https://github.com/apache/storm/pull/2347
  
    The integration test failure looks unrelated.


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141183242
  
    --- Diff: external/storm-blobstore-migration/pom.xml ---
    @@ -0,0 +1,126 @@
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    +         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +
    +    <parent>
    +        <artifactId>storm</artifactId>
    +        <groupId>org.apache.storm</groupId>
    +        <version>2.0.0-SNAPSHOT</version>
    +        <relativePath>../../pom.xml</relativePath>
    +    </parent>
    +    
    +    <artifactId>blobstore-migrator</artifactId>
    +    <packaging>jar</packaging>
    +    
    +    <name>blobstore-migrator</name>
    +    <url>http://maven.apache.org</url>
    +    <dependencies>
    +        <dependency>
    +            <groupId>org.apache.storm</groupId>
    +            <artifactId>storm-server</artifactId>
    +            <version>${project.version}</version>
    +            <exclusions>
    +                <!--log4j-over-slf4j must be excluded for hadoop-minicluster
    +                    see: http://stackoverflow.com/q/20469026/3542091 -->
    +                <exclusion>
    +                    <groupId>org.slf4j</groupId>
    +                    <artifactId>log4j-over-slf4j</artifactId>
    +                </exclusion>
    +            </exclusions>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.apache.storm</groupId>
    +            <artifactId>storm-hdfs</artifactId>
    +            <version>${project.version}</version>
    +            <exclusions>
    +                <!--log4j-over-slf4j must be excluded for hadoop-minicluster
    +                    see: http://stackoverflow.com/q/20469026/3542091 -->
    +                <exclusion>
    +                    <groupId>org.slf4j</groupId>
    +                    <artifactId>log4j-over-slf4j</artifactId>
    +                </exclusion>
    +            </exclusions>
    +        </dependency>
    +        <!-- <dependency> -->
    +        <!--     <artifactId>storm-core</artifactId> -->
    +        <!--     <groupId>org.apache.storm</groupId> -->
    +        <!--     <version>0.10.2.y</version> -->
    +        <!-- </dependency> -->
    +        <dependency>
    +            <artifactId>hadoop-hdfs</artifactId>
    +            <groupId>org.apache.hadoop</groupId>
    +            <version>${hdfs.version}</version>
    +        </dependency>
    +        <dependency>
    +            <artifactId>hadoop-client</artifactId>
    +            <groupId>org.apache.hadoop</groupId>
    +            <version>${hadoop.version}</version>
    +        </dependency>
    +        <dependency>
    +            <artifactId>hadoop-common</artifactId>
    +            <groupId>org.apache.hadoop</groupId>
    +            <version>${hadoop.version}</version>
    +        </dependency>
    +        <dependency>
    +            <groupId>yahoo.yinst.storm_hadoop_client_conf</groupId>
    +            <artifactId>storm_hadoop_client_conf</artifactId>
    +            <version>1.0.0.4</version>
    +        </dependency>
    +    </dependencies>
    +    <repositories>
    +        <repository>
    +            <id>central-ymaven</id>
    +            <url>http://ymaven.corp.yahoo.com:9999/proximity/repository/public</url>
    +            <snapshots>
    +                <enabled>true</enabled>
    +            </snapshots>
    +            <releases>
    +                <enabled>true</enabled>
    +            </releases>
    +        </repository>
    +    </repositories>
    +    <build>
    +        <plugins>
    +            <plugin>
    +                <artifactId>maven-compiler-plugin</artifactId>
    +                <version>3.1</version>
    +                <configuration>
    +                    <source>1.8</source>
    +                    <target>1.8</target>
    +                </configuration>
    +            </plugin>
    +            <plugin>
    +				<groupId>org.apache.maven.plugins</groupId>
    +				<artifactId>maven-jar-plugin</artifactId>
    +				<configuration>
    +					<archive>
    +						<manifest>
    +							<mainClass>org.apache.storm.blobstore.MigratorMain</mainClass>
    +						</manifest>
    +					</archive>
    +				</configuration>
    +			</plugin>
    --- End diff --
    
    Indentation looks really off here.


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141183809
  
    --- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/ListHDFS.java ---
    @@ -0,0 +1,52 @@
    +package org.apache.storm.blobstore;
    +
    +import java.util.Map;
    +
    +import javax.security.auth.Subject;
    +
    +import org.apache.storm.Config;
    +import org.apache.storm.blobstore.ClientBlobStore;
    +import org.apache.storm.hdfs.blobstore.HdfsBlobStore;
    +import org.apache.storm.hdfs.blobstore.HdfsClientBlobStore;
    +import org.apache.storm.utils.Utils;
    +
    +public class ListHDFS {
    +    
    +    public static void main(String[] args) throws Exception {
    --- End diff --
    
    Could you file a follow-on JIRA to make this part of the storm admin command?


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141183361
  
    --- Diff: external/storm-blobstore-migration/migrate.sh ---
    @@ -0,0 +1,13 @@
    +#!/usr/bin/env bash
    --- End diff --
    
    Needs a license header.


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141184296
  
    --- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/ListHDFS.java ---
    @@ -0,0 +1,52 @@
    +package org.apache.storm.blobstore;
    +
    +import java.util.Map;
    +
    +import javax.security.auth.Subject;
    +
    +import org.apache.storm.Config;
    +import org.apache.storm.blobstore.ClientBlobStore;
    +import org.apache.storm.hdfs.blobstore.HdfsBlobStore;
    +import org.apache.storm.hdfs.blobstore.HdfsClientBlobStore;
    +import org.apache.storm.utils.Utils;
    +
    +public class ListHDFS {
    +    
    +    public static void main(String[] args) throws Exception {
    +        if(args.length < 1) {
    +            System.out.println("Need at least 1 argument (hdfs_blobstore_path), but have " + Integer.toString(args.length));
    +            System.out.println("listHDFS <hdfs_blobstore_path> <hdfs_principal> <keytab>");
    +            System.out.println("Lists blobs in HdfsBlobStore");
    +            System.out.println("Example: listHDFS 'hdfs://some-hdfs-namenode:8080/srv/storm/my-storm-blobstore' 'stormUser/my-nimbus-host.example.com@STORM.EXAMPLE.COM' '/srv/my-keytab/stormUser.kt'");
    +            System.exit(1);
    +        }
    +        
    +        Map<String, Object> hdfsConf = Utils.readStormConfig();
    +        String hdfsBlobstorePath = args[0];
    +        
    +        hdfsConf.put(Config.BLOBSTORE_DIR, hdfsBlobstorePath);
    +        hdfsConf.put(Config.STORM_PRINCIPAL_TO_LOCAL_PLUGIN, "org.apache.storm.security.auth.DefaultPrincipalToLocal");
    +        if(args.length >= 2) {
    +        	System.out.println("SETTING HDFS PRINCIPAL!");
    +        	hdfsConf.put(Config.BLOBSTORE_HDFS_PRINCIPAL, args[1]);
    +        }
    +        if(args.length >= 3) {
    +        	System.out.println("SETTING HDFS KEYTAB!");
    +        	hdfsConf.put(Config.BLOBSTORE_HDFS_KEYTAB, args[2]);
    +        }
    +        
    +        /* CREATE THE BLOBSTORES */
    +        System.out.println("Creating HDFS blobstore.");
    +        HdfsBlobStore hdfsBlobStore = new HdfsBlobStore();
    +        hdfsBlobStore.prepare(hdfsConf, null, null);
    +        System.out.println("Done.");
    --- End diff --
    
    I'm not sure users will, by default, expect the third "Done." to mean we really are done. Could you make the messages a bit more descriptive, or remove them?
    
    This goes for listing the local blobs as well.
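
One way to read the request above is that each status line should say which step finished, so the last line is unambiguous. The following is only a minimal sketch of that idea and is not taken from the pull request: the class name `ListProgressSketch`, the method `describeRun`, and the message wording are all hypothetical, and the sketch deliberately avoids the Storm API so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the blobstore-migrator code.
public class ListProgressSketch {

    /**
     * Builds the status lines for one listing run. Each line names the step
     * it reports on, instead of repeating a bare "Done.".
     */
    static List<String> describeRun(String blobstorePath, List<String> keys) {
        List<String> lines = new ArrayList<>();
        lines.add("Creating HDFS blobstore at " + blobstorePath);
        lines.add("HDFS blobstore ready; listing keys");
        lines.addAll(keys);
        lines.add("Listed " + keys.size() + " blob(s); exiting");
        return lines;
    }

    public static void main(String[] args) {
        // The keys below are made-up sample data.
        List<String> keys = Arrays.asList("topo-1-stormjar.jar", "topo-1-stormconf.ser");
        for (String line : describeRun(
                "hdfs://some-hdfs-namenode:8080/srv/storm/my-storm-blobstore", keys)) {
            System.out.println(line);
        }
    }
}
```

Keeping the message construction in a separate method is a design choice for the sketch only; it makes the wording easy to check without touching a real blobstore.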


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141420947
  
    --- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/ListHDFS.java ---
    @@ -0,0 +1,52 @@
    +package org.apache.storm.blobstore;
    +
    +import java.util.Map;
    +
    +import javax.security.auth.Subject;
    +
    +import org.apache.storm.Config;
    +import org.apache.storm.blobstore.ClientBlobStore;
    +import org.apache.storm.hdfs.blobstore.HdfsBlobStore;
    +import org.apache.storm.hdfs.blobstore.HdfsClientBlobStore;
    +import org.apache.storm.utils.Utils;
    +
    +public class ListHDFS {
    +    
    +    public static void main(String[] args) throws Exception {
    --- End diff --
    
    [STORM-2763](https://issues.apache.org/jira/browse/STORM-2763)


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/2347


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141183305
  
    --- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/ListHDFS.java ---
    @@ -0,0 +1,52 @@
    +package org.apache.storm.blobstore;
    --- End diff --
    
    Needs a license header.
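
For reference, the standard Apache Software Foundation source header, written as a Java block comment, looks roughly like the sketch below. The class `LicensedExample` is a hypothetical placeholder and is not a file from the pull request; the header text is the usual ASF wording, and the exact comment style a project expects (for example `/*` versus `/**`) is an assumption that should be checked against the project's existing files.

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

// Hypothetical placeholder class; only the header above is the point.
public class LicensedExample {
    public static void main(String[] args) {
        System.out.println("This file carries the ASF license header.");
    }
}
```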


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141183394
  
    --- Diff: external/storm-blobstore-migration/pom.xml ---
    @@ -0,0 +1,126 @@
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    --- End diff --
    
    Needs a license header (all of the files do).


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141184831
  
    --- Diff: external/storm-blobstore-migration/src/main/java/org/apache/storm/blobstore/MigrateBlobs.java ---
    @@ -0,0 +1,125 @@
    +package org.apache.storm.blobstore;
    +
    +import java.io.IOException;
    +import java.io.InputStream;
    +import java.util.Map;
    +
    +import javax.security.auth.Subject;
    +import javax.security.auth.login.LoginContext;
    +
    +import org.apache.storm.Config;
    +import org.apache.storm.blobstore.BlobStore;
    +import org.apache.storm.hdfs.blobstore.HdfsBlobStore;
    +import org.apache.storm.nimbus.NimbusInfo;
    +import org.apache.storm.utils.Utils;
    +import org.apache.storm.blobstore.LocalFsBlobStore;
    +import org.apache.storm.generated.AuthorizationException;
    +import org.apache.storm.generated.KeyAlreadyExistsException;
    +import org.apache.storm.generated.KeyNotFoundException;
    +import org.apache.storm.generated.ReadableBlobMeta;
    +import org.apache.storm.generated.SettableBlobMeta;
    +
    +public class MigrateBlobs {
    +    
    +    protected static void deleteAllBlobStoreKeys(BlobStore bs, Subject who) throws AuthorizationException, KeyNotFoundException {
    +        Iterable<String> hdfsKeys = () -> bs.listKeys();
    +        for(String key : hdfsKeys) {
    +            System.out.println(key);
    +            bs.deleteBlob(key, who);
    +        }
    +    }
    +    
    +    protected static void copyBlobStoreKeys(BlobStore bsFrom, Subject whoFrom, BlobStore bsTo, Subject whoTo) throws AuthorizationException, KeyAlreadyExistsException, IOException, KeyNotFoundException {
    +        Iterable<String> lfsKeys = () -> bsFrom.listKeys();
    +        for(String key : lfsKeys) {
    +            ReadableBlobMeta readable_meta = bsFrom.getBlobMeta(key, whoFrom);
    +            SettableBlobMeta meta = readable_meta.get_settable();
    +            InputStream in = bsFrom.getBlob(key, whoFrom);
    +            System.out.println("COPYING BLOB " + key + " FROM " + bsFrom + " TO " + bsTo);
    +            bsTo.createBlob(key, in, meta, whoTo);
    +            System.out.println("DONE CREATING BLOB " + key);
    +        }
    +    }
    +    
    +    
    +    public static void main(String[] args) throws Exception {
    +        // TODO Auto-generated method stub
    --- End diff --
    
    Please remove this comment.


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141182693
  
    --- Diff: external/storm-blobstore-migration/README.md ---
    @@ -0,0 +1,87 @@
    +# Blobstore Migrator
    +
    +## Basic Use
    +-----
    +
    +### Build The Thing
    +Use make to build a tarball with everything needed.
    +```
    +$ make
    +```
    +
    +### Use The Thing
    +Copy and extract the tarball
    +```
    +$ scp blobstore-migrator.tgz my-nimbus-host.example.com:~/
    +$ ssh my-nimbus-host.example.com
    +... On my-nimbus-host ...
    +$ tar -xvzf blobstore-migrator.tgz
    +```
    +
    +This will expand into a blobstore-migrator directory with all the scripts and the jar.
    +```
    +$ cd blobstore-migrator
    +$ ls
    +blobstore-migrator-2.0.jar listHDFS.sh  listLocal.sh  migrate.sh
    +```
    +
    +To run, first create a config for the cluster.
    +The config must be named 'config'
    +It must contain definitions for `HDFS_BLOBSTORE_DIR`, `LOCAL_BLOBSTORE_DIR`, and `HADOOP_CLASSPATH`.
    +Hadoop jars are packaged with neither storm nor this package, so they must be installed separately.
    +
    +Optional configs used to configure security are: `BLOBSTORE_PRINCIPAL`, `KEYTAB_FILE`, and `JAAS_CONF`
    +
    +Example:
    +```
    +$ cat config
    +HDFS_BLOBSTORE_DIR='hdfs://some-hdfs-namenode:8080/srv/storm/my-storm-blobstore'
    +LOCAL_BLOBSTORE_DIR='/srv/storm'
    +HADOOP_CLASSPATH='/hadoop/share/hdfs/*:/hadoop/common/*'
    +
    +# My security configs: 
    +BLOBSTORE_PRINCIPAL='stormUser/my-nimbus-host.example.com@STORM.EXAMPLE.COM'
    +KEYTAB_FILE='/srv/my-keytab/stormUser.kt'
    +JAAS_CONF='/storm/conf/storm_jaas.conf'
    +```
    +
    +Now you can run any of the scripts, all of which require config to exist:
    + - listHDFS.sh: lists all blobs currently in the HDFS Blobstore
    + - listLocal.sh: lists all blobs currently in the local Blobstore
    + - migrate.sh: Begins the migration process for Nimbus. (Read instructions below first)
    + 
    + 
    +#### Migrating
    +##### Nimbus
    +To migrate blobs from nimbus, the following steps are necessary:
    +
    +1. Shut down Nimbus
    --- End diff --
    
    Can we make this "Shut down all Nimbus instances"? Because of HA, we cannot leave any of them up.
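
The README quoted above says the `config` file must define `HDFS_BLOBSTORE_DIR`, `LOCAL_BLOBSTORE_DIR`, and `HADOOP_CLASSPATH`. A check for those keys before running a migration could be sketched as follows. This is an assumption-laden illustration, not something the scripts in the pull request are known to do: the class `ConfigCheckSketch` and the method `missingKeys` are hypothetical, and the parsing only looks for `KEY=` at the start of a line, which is a simplification of shell syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of the blobstore-migrator code.
public class ConfigCheckSketch {

    // Required keys, as listed in the README quoted in this thread.
    static final List<String> REQUIRED =
        Arrays.asList("HDFS_BLOBSTORE_DIR", "LOCAL_BLOBSTORE_DIR", "HADOOP_CLASSPATH");

    /** Returns the required keys that no line of the config defines. */
    static List<String> missingKeys(List<String> configLines) {
        Set<String> defined = new HashSet<>();
        for (String line : configLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            int eq = trimmed.indexOf('=');
            if (eq > 0) {
                defined.add(trimmed.substring(0, eq));
            }
        }
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            if (!defined.contains(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample lines modeled on the README example; HADOOP_CLASSPATH is left out on purpose.
        List<String> lines = Arrays.asList(
            "HDFS_BLOBSTORE_DIR='hdfs://some-hdfs-namenode:8080/srv/storm/my-storm-blobstore'",
            "LOCAL_BLOBSTORE_DIR='/srv/storm'",
            "# My security configs:",
            "KEYTAB_FILE='/srv/my-keytab/stormUser.kt'");
        System.out.println("Missing required keys: " + missingKeys(lines));
    }
}
```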


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141185443
  
    --- Diff: external/storm-blobstore-migration/src/test/java/org/apache/storm/blobstore/AppTest.java ---
    @@ -0,0 +1,38 @@
    +package org.apache.storm.blobstore;
    +
    +import junit.framework.Test;
    +import junit.framework.TestCase;
    +import junit.framework.TestSuite;
    +
    +/**
    + * Unit test for simple App.
    + */
    +public class AppTest 
    --- End diff --
    
    Can we remove this file? Having a bogus test is worse than no test at all.


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141183179
  
    --- Diff: external/storm-blobstore-migration/pom.xml ---
    @@ -0,0 +1,126 @@
    +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    +         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +
    +    <parent>
    +        <artifactId>storm</artifactId>
    +        <groupId>org.apache.storm</groupId>
    +        <version>2.0.0-SNAPSHOT</version>
    +        <relativePath>../../pom.xml</relativePath>
    +    </parent>
    +    
    +    <artifactId>blobstore-migrator</artifactId>
    +    <packaging>jar</packaging>
    +    
    +    <name>blobstore-migrator</name>
    +    <url>http://maven.apache.org</url>
    +    <dependencies>
    +        <dependency>
    +            <groupId>org.apache.storm</groupId>
    +            <artifactId>storm-server</artifactId>
    +            <version>${project.version}</version>
    +            <exclusions>
    +                <!--log4j-over-slf4j must be excluded for hadoop-minicluster
    +                    see: http://stackoverflow.com/q/20469026/3542091 -->
    +                <exclusion>
    +                    <groupId>org.slf4j</groupId>
    +                    <artifactId>log4j-over-slf4j</artifactId>
    +                </exclusion>
    +            </exclusions>
    +        </dependency>
    +        <dependency>
    +            <groupId>org.apache.storm</groupId>
    +            <artifactId>storm-hdfs</artifactId>
    +            <version>${project.version}</version>
    +            <exclusions>
    +                <!--log4j-over-slf4j must be excluded for hadoop-minicluster
    +                    see: http://stackoverflow.com/q/20469026/3542091 -->
    +                <exclusion>
    +                    <groupId>org.slf4j</groupId>
    +                    <artifactId>log4j-over-slf4j</artifactId>
    +                </exclusion>
    +            </exclusions>
    +        </dependency>
    +        <!-- <dependency> -->
    +        <!--     <artifactId>storm-core</artifactId> -->
    +        <!--     <groupId>org.apache.storm</groupId> -->
    +        <!--     <version>0.10.2.y</version> -->
    +        <!-- </dependency> -->
    --- End diff --
    
    Can we remove the commented out code?


---

[GitHub] storm pull request #2347: STORM-2760: Add Blobstore Migration Scripts

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2347#discussion_r141182276
  
    --- Diff: external/storm-blobstore-migration/Makefile ---
    @@ -0,0 +1,19 @@
    +PACKAGE_NAME=blobstore-migrator.tgz
    +
    +VERSION=$(shell cat VERSION || mvn help:evaluate -Dexpression=project.version | grep -v '^\[')
    +
    +all: $(PACKAGE_NAME)
    +
    +$(PACKAGE_NAME) : VERSION target/blobstore-migrator-$(VERSION).jar
    +	-@rm -Rf blobstore-migrator $(PACKAGE_NAME)
    +	mkdir blobstore-migrator
    +	cp target/blobstore-migrator-$(VERSION).jar blobstore-migrator/
    +	cp listHDFS.sh listLocal.sh migrate.sh VERSION blobstore-migrator/
    +	tar -cvzf $(PACKAGE_NAME) blobstore-migrator
    +#	rm -Rf blobstore-migrator
    --- End diff --
    
    Can we delete the comment?


---