You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org> on 2010/08/17 02:55:19 UTC

[jira] Updated: (PIG-1205) Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc

     [ https://issues.apache.org/jira/browse/PIG-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1205:
-----------------------------------

    Attachment: PIG_1205_5.path

This patch (not really review-ready yet) introduces the Elephant-Bird improvements.

You can use -gt, -gte, -lt, -lte flags to filter out row ranges, specify caching and per-region row limits, and you can specify the caster to use (interpret Strings, as before, or use bytes directly for more eficient storage and communication).

The filtering is a bit off because it still spins up all the map tasks, the ones whose keys are filtered out just finish extremely fast. 

The progress reporting is a bit jittery, but better than nothing.

TODO: fix up filtering, add projection pushdown, add filter pushdown, and write better tests.



> Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
> ------------------------------------------------------------------------------
>
>                 Key: PIG-1205
>                 URL: https://issues.apache.org/jira/browse/PIG-1205
>             Project: Pig
>          Issue Type: Sub-task
>    Affects Versions: 0.7.0
>            Reporter: Jeff Zhang
>            Assignee: Dmitriy V. Ryaboy
>             Fix For: 0.8.0
>
>         Attachments: PIG_1205.patch, PIG_1205_2.patch, PIG_1205_3.patch, PIG_1205_4.patch, PIG_1205_5.path
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.