You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2019/06/24 17:31:00 UTC

[jira] [Updated] (PIG-4251) Pig on Storm

     [ https://issues.apache.org/jira/browse/PIG-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4251:
------------------------------------
    Status: Open  (was: Patch Available)

Canceling Patch Available status as we don't have further plans to proceed with this.

> Pig on Storm
> ------------
>
>                 Key: PIG-4251
>                 URL: https://issues.apache.org/jira/browse/PIG-4251
>             Project: Pig
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 0.13.0
>            Reporter: Mridul Jain
>            Priority: Major
>         Attachments: pig_on_storm.patch
>
>
> This is a jira to submit the patch where we propose, PIG as the primary language for expressing realtime stream processing logic and provide a working prototype on Storm. This includes running the existing PIG UDFs, seamlessly on Storm. Though PIG or Storm do not take any position on state, this system also provides built-in support for advanced state semantics like sliding windows, global mutable state etc, which are required in real world applications. 
> Associated talk and slides: 
> https://www.youtube.com/watch?v=fd-I5EtxSuI&feature=youtu.be
> http://www.slideshare.net/Hadoop_Summit/t-435p230-cjain
> -----------
> Setup
> -----------
> Prereq.
>     Storm and Hadoop23 are installed on the host
> - Unpack PigOnStorm tarball (It packs pig as well)
> - export PATH=<storm_home>/bin:<hadoop_home>/bin:$PATH
> - export JAVA_HOME=<java_home>
> - export PIG_HOME=<directory for PigOnStorm installation>
> - export PIG_CLASSPATH=<storm_home>/conf:<hadoop_home>/conf (to get the storm configuration, e.g. nimbus host, hadoop site.xml etc) 
> - “pig -x storm_local” to launch pig with Storm Local mode (No Storm remote cluster needed) 
> - “pig -x storm” to launch pig with Remote storm cluster
> Use Hbase
>     - export export HBASE_HOME=<hbase_home_dir>
> -----------
> Available UDFs
> -----------
> - File loader 
> load '/tmp/props' using org.apache.pig.backend.hadoop.executionengine.storm.emitters.POSFileEmitter;
> - Hbase WindowStore
>     store A into 'testTable' using org.apache.pig.backend.hadoop.executionengine.storm.state.hbase.WindowHbaseStore('fam');
> - Hbase WindowLoad
>     load 'testTable,-10,-1' using org.apache.pig.backend.hadoop.executionengine.storm.state.hbase.WindowHbaseStore('fam');



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)