Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/08/24 23:51:59 UTC

[jira] Created: (HIVE-787) Hive Freeway - support near-realtime data processing

Hive Freeway - support near-realtime data processing
----------------------------------------------------

                 Key: HIVE-787
                 URL: https://issues.apache.org/jira/browse/HIVE-787
             Project: Hadoop Hive
          Issue Type: New Feature
            Reporter: Zheng Shao


Most people are using Hive for daily (or at most hourly) data processing.
We want to explore the obstacles to using Hive at 15-minute, 5-minute, or even 1-minute data processing intervals, and remove them.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-787) Hive Freeway - support near-realtime data processing

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747440#action_12747440 ] 

Edward Capriolo commented on HIVE-787:
--------------------------------------

My previous comment was unclear. Now that Hadoop has append support, it would be nice to be able to write directly to a Hive table from a Hadoop job doing data ingestion. In my case, I am pulling files into DFS to be added later with 'add file'. It would be nice if I had a HiveOutputFormat and I could emit() data to Hive.


[jira] Commented: (HIVE-787) Hive Freeway - support near-realtime data processing

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926192#action_12926192 ] 

Jeff Hammerbacher commented on HIVE-787:
----------------------------------------

More details on the Data Freeway implementation at Facebook: http://vimeo.com/15337985


[jira] Commented: (HIVE-787) Hive Freeway - support near-realtime data processing

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747157#action_12747157 ] 

Edward Capriolo commented on HIVE-787:
--------------------------------------

What about some Scribe-like feature where the map phase or reduce phase can write/append to a Hive table and partition?
