Posted to issues@tajo.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2014/11/18 10:37:33 UTC

[jira] [Commented] (TAJO-1033) Implement a FileAppender for ElasticSearch.

    [ https://issues.apache.org/jira/browse/TAJO-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215987#comment-14215987 ] 

ASF GitHub Bot commented on TAJO-1033:
--------------------------------------

GitHub user blrunner opened a pull request:

    https://github.com/apache/tajo/pull/251

    TAJO-1033: Implement a FileAppender for ElasticSearch.

    I implemented ElasticSearchAppender. You can load Tajo table data into an elasticsearch index as follows:
    
     1. Create an elasticsearch index and index type.
    ex) index: radio, index type: artists
    
     2. Create a Tajo table that maps to the elasticsearch index. Note that you must use elasticsearch as the storage type.
    ex) CREATE TABLE estable1 (id int, name text, score float, type text)
    STORED AS elasticsearch
    with ('elasticsearch.cluster'='elasticsearch',
    'elasticsearch.nodes'='localhost:9300',
    'elasticsearch.resources'='radio/artists',
    'elasticsearch.replication'='1',
    'elasticsearch.bulk.item.size'='1000');
    
     3. Load Tajo data into the elasticsearch index with SQL.
    ex) insert overwrite into estable1 select * from table1;
    
    This patch is not complete yet. I'll write documentation and implement a few methods to improve performance. We also need to allow users to change the number of last tasks, because a large number of storage tasks will cause low performance of the elasticsearch cluster.
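
As a rough illustration of the table options above, the following sketch shows one way an appender could interpret 'elasticsearch.resources' and 'elasticsearch.bulk.item.size'. It is not taken from the patch: the helper names parse_resource and batch_rows are hypothetical, and the assumptions that the resource string has the form index/type and that the bulk item size is the number of rows per bulk request are guesses based only on the example values.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Table options copied from the CREATE TABLE example above.
TABLE_OPTIONS: Dict[str, str] = {
    "elasticsearch.cluster": "elasticsearch",
    "elasticsearch.nodes": "localhost:9300",
    "elasticsearch.resources": "radio/artists",
    "elasticsearch.replication": "1",
    "elasticsearch.bulk.item.size": "1000",
}


def parse_resource(options: Dict[str, str]) -> Tuple[str, str]:
    """Split 'index/type' into (index, type). The format is an assumption."""
    index, _, index_type = options["elasticsearch.resources"].partition("/")
    return index, index_type


def batch_rows(rows: Iterable, options: Dict[str, str]) -> Iterator[List]:
    """Group rows into lists of at most 'elasticsearch.bulk.item.size' items.

    Assumption: each yielded list would be sent as one bulk request.
    """
    size = int(options["elasticsearch.bulk.item.size"])
    if size <= 0:
        raise ValueError("elasticsearch.bulk.item.size must be positive")
    batch: List = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch
```

Under these assumptions, 2500 rows would be grouped into three bulk requests, so a larger bulk item size means fewer round trips per task.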

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/blrunner/tajo TAJO-1033

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/251.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #251
    
----
commit 088004a37a82bbe0ab976a56b87b0aba86c08b09
Author: JaeHwa Jung <bl...@apache.org>
Date:   2014-11-17T06:07:37Z

    Initial Commit

commit c0a25ac57839fb3b37d5c93a2423596809b33dec
Author: JaeHwa Jung <bl...@apache.org>
Date:   2014-11-17T06:16:21Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1033
    
    Conflicts:
    	tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/CatalogUtil.java
    	tajo-catalog/tajo-catalog-common/src/main/proto/CatalogProtos.proto
    	tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/PhysicalPlanUtil.java
    	tajo-storage/pom.xml
    	tajo-storage/src/main/resources/storage-default.xml
    	tajo-storage/src/test/resources/storage-default.xml

commit 167b220034a7373c88cdbc60b6baedd9ce79f1c8
Author: JaeHwa Jung <bl...@apache.org>
Date:   2014-11-17T11:56:56Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1033

commit 806e04a3153784df87afe6872696c52214bd8c51
Author: JaeHwa Jung <bl...@apache.org>
Date:   2014-11-18T09:04:40Z

    TAJO-1033: Implement a FileAppender for ElasticSearch.

commit 2236cb2822d8ad30f56e3125b2aa108971e3f068
Author: JaeHwa Jung <bl...@apache.org>
Date:   2014-11-18T09:07:30Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1033
    
    Conflicts:
    	tajo-storage/src/main/resources/storage-default.xml
    	tajo-storage/src/test/resources/storage-default.xml

----


> Implement a FileAppender for ElasticSearch.
> -------------------------------------------
>
>                 Key: TAJO-1033
>                 URL: https://issues.apache.org/jira/browse/TAJO-1033
>             Project: Tajo
>          Issue Type: New Feature
>          Components: storage
>            Reporter: Jaehwa Jung
>            Assignee: Jaehwa Jung
>         Attachments: ImplementanappenderforElasticSearch..pdf
>
>
> ElasticSearch (ES) is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License.
> I think ES is a very powerful solution for serving data, for example, Tajo query results. Currently, users can load data stored in HDFS into ES indices using the APIs that ES provides. But to do that, they have to write a program themselves, which requires a solid knowledge of HDFS, ES, and a language such as Java. While researching ways to make their lives easier, I found that Tajo can help by providing a Tajo appender for ES, with which users can index data in ES using SQL in Tajo. I believe this feature will help users make better use of Tajo by leveraging its ecosystem.
> For more information, please refer to the attached file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)