Posted to issues@tajo.apache.org by hys9958 <gi...@git.apache.org> on 2014/12/01 08:18:24 UTC

[GitHub] tajo pull request: TAJO-1199: EMR bootstrap script for Tajo

GitHub user hys9958 opened a pull request:

    https://github.com/apache/tajo/pull/275

    TAJO-1199: EMR bootstrap script for Tajo

    Bootstrap Action Arguments:
    ==========================
    
    Usage: install-tajo.sh [OPTIONS]
    
        -t [S3_PATH_TO_TAJO_BIN_TARBALL]
           Ex: s3://[your_bucket]/[your_path]/tajo-{version}.tar.gz
           Default: http://d3kp3z3ppbkcio.cloudfront.net/tajo-0.9.0/tajo-0.9.0.tar.gz
        -c [S3_PATH_TO_TAJO_CONF_DIR] 
           Ex: s3://[your_bucket]/[your_path]/conf
        -l [S3_PATH_TO_THIRD_PARTY_JARS_DIR]
           Ex: s3://[your_bucket]/[your_path]/lib
        -h
           Display help message
        -T [LOCAL_PATH_TO_TEST_ROOT] (only used for local test)
           Ex: /[LOCAL_PATH_TO_TEST_ROOT]
        -H [LOCAL_PATH_TO_HADOOP_HOME_FOR_TEST] (only used for local test)
           Ex: /[LOCAL_PATH_TO_HADOOP_HOME_FOR_TEST]
    
    Note that all arguments are optional. ``-T`` and ``-H`` are only for local testing.
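
    The option list above could be parsed with ``getopts`` roughly as follows. This is a minimal sketch, not the actual ``install-tajo.sh``: the function name ``parse_args`` and all variable names are hypothetical, and only the option letters and the default tarball URL are taken from the usage text.

    ```shell
    # Hypothetical sketch of option parsing; NOT the real install-tajo.sh.
    # Variable names below are assumptions made for illustration only.
    parse_args() {
        # Default tarball URL as listed in the usage text.
        TAJO_TARBALL="http://d3kp3z3ppbkcio.cloudfront.net/tajo-0.9.0/tajo-0.9.0.tar.gz"
        TAJO_CONF_DIR=""
        TAJO_LIB_DIR=""
        TEST_ROOT=""
        TEST_HADOOP_HOME=""
        SHOW_HELP=0
        OPTIND=1  # reset so the function can be called more than once
        # Leading ":" selects silent mode, so errors are reported by the cases below.
        while getopts ":t:c:l:T:H:h" opt; do
            case "$opt" in
                t) TAJO_TARBALL="$OPTARG" ;;
                c) TAJO_CONF_DIR="$OPTARG" ;;
                l) TAJO_LIB_DIR="$OPTARG" ;;
                T) TEST_ROOT="$OPTARG" ;;
                H) TEST_HADOOP_HOME="$OPTARG" ;;
                h) SHOW_HELP=1 ;;
                \?) echo "Unknown option: -$OPTARG" >&2; return 1 ;;
                :) echo "Option -$OPTARG requires a value" >&2; return 1 ;;
            esac
        done
    }
    ```

    Calling ``parse_args "$@"`` at the top of a script would leave every unset option at its default, which matches the note that all arguments are optional.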
    
    Sample Commands:
    ================
    
    Launching a Tajo cluster with the default configuration
    -------------------------------------------------------
     * It uses EMR HDFS as ```tajo.rootdir```, which contains the warehouse directory.
     * It uses the default heap and concurrency settings.
     * It is suitable for a simple test.
     
    ```
    $ aws emr create-cluster    \
    	--name="[CLUSTER_NAME]"  \
    	--ami-version=3.3        \
    	--ec2-attributes KeyName=[KEY_PAIR_NAME] \
    	--instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge InstanceGroupType=CORE,InstanceCount=1,InstanceType=c3.xlarge \
    	--bootstrap-action Name="Install tajo",Path=s3://[your_bucket]/[your_path]/install-tajo.sh
    ```
    
    Launching a Tajo cluster with additional configurations
    -------------------------------------------------------
    
     * To use your own Tajo tarball, use ```-t``` to specify its S3 URL.
     * To change ```tajo.rootdir```, create your own ```tajo-site.xml``` and use the ```-c``` option to specify the S3 URL of the config directory.
       * You can find suitable config templates in tajo-emr/template.
     * To use RDS, you need the appropriate JDBC JARs, such as mysql-connector.jar. The ```-l``` option lets you specify the S3 URL of a directory containing third-party JARs.
    
     
    ```
        aws emr create-cluster \
        --name="[CLUSTER_NAME]" \
        --ami-version=3.3 \
        --ec2-attributes KeyName=[KEY_PAIR_NAME] \
        --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge InstanceGroupType=CORE,InstanceCount=1,InstanceType=c3.xlarge \
        --bootstrap-action Name="Install tajo",Path=s3://[your_bucket]/[your_path]/install-tajo.sh,Args=["-t","s3://[your_bucket]/tajo-0.9.0.tar.gz","-c","s3://[your_bucket]/conf","-l","s3://[your_bucket]/lib"]
    ```
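
    Because the bootstrap-action value grows quickly once ```-t```, ```-c``` and ```-l``` are added, it can help to assemble it in a small shell function. The sketch below is illustrative only: ```bootstrap_action``` is a hypothetical helper, not part of this patch, and it assumes the AWS CLI shorthand parser accepts the double-quoted values it emits.

    ```shell
    # Hypothetical helper (not part of the patch): prints the value for the
    # bootstrap-action option in the same Name=...,Path=...,Args=[...] form
    # used in the sample commands.
    # $1 = S3 path of install-tajo.sh; remaining arguments go to the script.
    bootstrap_action() {
        script_path="$1"
        shift
        action="Name=\"Install tajo\",Path=${script_path}"
        if [ "$#" -gt 0 ]; then
            args=""
            for a in "$@"; do
                # Quote each argument and join the arguments with commas.
                args="${args:+${args},}\"${a}\""
            done
            action="${action},Args=[${args}]"
        fi
        printf '%s\n' "$action"
    }
    ```

    For example, ```bootstrap_action s3://[your_bucket]/[your_path]/install-tajo.sh -c s3://[your_bucket]/conf``` prints a value that can be passed, inside double quotes, to the bootstrap-action option of ```aws emr create-cluster```.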
    
    How to test the bootstrap script on a local machine
    ===================================================
    ```install-tajo.sh``` lets users test the bootstrap on a local machine without EMR instances. To do so, use the ```-T``` and ```-H``` options.
     * ```-T``` - Root directory used temporarily for testing.
     * ```-H``` - Hadoop binary directory that stands in for the EMR Hadoop home.
    
    ```   
    $ ./install-tajo.sh -t /[your_local_binary_path]/tajo-0.9.0.tar.gz -c /[your_test_conf_dir]/conf -l /[your_test_lib_dir]/lib -T /[LOCAL_PATH_TO_TEST_ROOT] -H /[LOCAL_PATH_TO_HADOOP_HOME_FOR_TEST]
    ```
    
    Running with AWS RDS
    ====================
    Tajo can use RDS as its catalog store. To do so, make sure you already have a running RDS instance, then create your own ```catalog-site.xml```. Please refer to the [Catalog configuration documentation](http://tajo.apache.org/docs/current/configuration/catalog_configuration.html) in the Tajo docs.
    
    You should also use the ```-c``` option so that your custom ```catalog-site.xml``` file is picked up.
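
    As a rough illustration, a ```catalog-site.xml``` for a MySQL-backed RDS instance could be generated as below. The helper names ```emit_property``` and ```catalog_site_xml``` are hypothetical, and the property names and store class reflect my reading of the Tajo catalog configuration documentation, so check them against the linked page before relying on them.

    ```shell
    # Hypothetical helpers (not part of the patch) that print a minimal
    # catalog-site.xml. Property names are assumed from the Tajo catalog
    # configuration docs; verify them for your Tajo version.
    emit_property() {
        # $1 = property name, $2 = property value
        printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
    }

    # $1 = JDBC URI of the RDS instance, $2 = DB user, $3 = DB password
    catalog_site_xml() {
        printf '%s\n' '<?xml version="1.0"?>' '<configuration>'
        emit_property tajo.catalog.store.class org.apache.tajo.catalog.store.MySQLStore
        emit_property tajo.catalog.jdbc.connection.id "$2"
        emit_property tajo.catalog.jdbc.connection.password "$3"
        emit_property tajo.catalog.jdbc.uri "$1"
        printf '%s\n' '</configuration>'
    }
    ```

    Redirecting the output of ```catalog_site_xml``` to ```conf/catalog-site.xml``` and uploading that directory to S3 gives a location suitable for the ```-c``` option. Values are written verbatim, so any ```&``` or ```<``` in the URI or password would need XML escaping first.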

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hys9958/tajo tajo-1199

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/275.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #275
    
----
commit 0b4b135c81ca3548e78d622c26027808883b9c9f
Author: hys9958 <ha...@gmail.com>
Date:   2014-12-01T07:06:43Z

    TAJO-1199: EMR bootstrap script for Tajo

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1199: EMR bootstrap script for Tajo

Posted by hys9958 <gi...@git.apache.org>.
Github user hys9958 closed the pull request at:

    https://github.com/apache/tajo/pull/275



[GitHub] tajo pull request: TAJO-1199: EMR bootstrap script for Tajo

Posted by hyunsik <gi...@git.apache.org>.
Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/275#issuecomment-66257171
  
    Hi @hys9958, 
    
    Great work. I tested it on EMR several times. It works well and is very convenient. I'm very happy to see this work.
    
    It would be good to merge this script into the Tajo repository. Better still would be to submit it to the GitHub repository managed directly by Amazon (https://github.com/awslabs/emr-bootstrap-actions). If it is accepted there, Amazon will host the script and the Tajo release tarball on their S3.
    
    If that is OK with you, I can help you submit the patch to Amazon's repo.



[GitHub] tajo pull request: TAJO-1199: EMR bootstrap script for Tajo

Posted by hys9958 <gi...@git.apache.org>.
Github user hys9958 commented on the pull request:

    https://github.com/apache/tajo/pull/275#issuecomment-66416294
  
    Ok~
    Thanks!

