Posted to dev@pig.apache.org by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org> on 2010/08/17 02:55:19 UTC
[jira] Updated: (PIG-1205) Enhance HBaseStorage-- Make it support
loading row key and implement StoreFunc
[ https://issues.apache.org/jira/browse/PIG-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy V. Ryaboy updated PIG-1205:
-----------------------------------
Attachment: PIG_1205_5.path
This patch (not really review-ready yet) introduces the Elephant-Bird improvements.
You can use the -gt, -gte, -lt, and -lte flags to restrict the row-key range that is loaded, specify caching and per-region row limits, and choose the caster to use (interpret Strings, as before, or use bytes directly for more efficient storage and communication).
The filtering is a bit off because it still spins up all the map tasks; the ones whose keys are all filtered out just finish extremely fast.
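
To make the range semantics concrete, here is a small Python sketch. It is not taken from the patch: the function names, the region layout, and the sample keys are all made up for illustration. The only thing it relies on is that HBase orders row keys lexicographically by their bytes, which is also how Python compares bytes objects. It shows how the four bounds combine, and why a task whose region lies entirely outside the range has nothing to emit.

```python
# Illustrative sketch only; none of these names come from the patch.

def in_range(key, gt=None, gte=None, lt=None, lte=None):
    """Return True if key satisfies every bound that was supplied.

    Keys are bytes and are compared lexicographically, byte by byte.
    """
    if gt is not None and not key > gt:
        return False
    if gte is not None and not key >= gte:
        return False
    if lt is not None and not key < lt:
        return False
    if lte is not None and not key <= lte:
        return False
    return True


def scan_region(row_keys, **bounds):
    """Keep only the row keys of one region that fall inside the bounds."""
    return [k for k in row_keys if in_range(k, **bounds)]


# Hypothetical table split into three regions, one map task per region.
regions = {
    "region-a": [b"row01", b"row02", b"row03"],
    "region-b": [b"row10", b"row11"],
    "region-c": [b"row20", b"row21"],
}

# Roughly what "-gte row10 -lt row20" would select.
results = {
    name: scan_region(keys, gte=b"row10", lt=b"row20")
    for name, keys in regions.items()
}

for name, keys in results.items():
    print(name, keys)
```

In this sketch every region is still visited, but region-a and region-c yield no rows, which mirrors the behaviour described above: the tasks start, find nothing in range, and finish almost immediately.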
The progress reporting is a bit jittery, but better than nothing.
TODO: fix up filtering, add projection pushdown, add filter pushdown, and write better tests.
> Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
> ------------------------------------------------------------------------------
>
> Key: PIG-1205
> URL: https://issues.apache.org/jira/browse/PIG-1205
> Project: Pig
> Issue Type: Sub-task
> Affects Versions: 0.7.0
> Reporter: Jeff Zhang
> Assignee: Dmitriy V. Ryaboy
> Fix For: 0.8.0
>
> Attachments: PIG_1205.patch, PIG_1205_2.patch, PIG_1205_3.patch, PIG_1205_4.patch, PIG_1205_5.path
>
>
--
This message is automatically generated by JIRA.