You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2009/12/10 19:56:18 UTC

[jira] Commented: (PIG-1117) Pig reading hive columnar rc tables

    [ https://issues.apache.org/jira/browse/PIG-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788842#action_12788842 ] 

Hadoop QA commented on PIG-1117:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12427606/PIG-1117.patch
  against trunk revision 888852.

    -1 @author.  The patch appears to contain 1 @author tags which the Pig community has agreed to not allow in code contributions.

    +1 tests included.  The patch appears to include 5 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 release audit.  The applied patch generated 393 release audit warnings (more than the trunk's current 391 warnings).

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/113/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/113/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/113/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/113/console

This message is automatically generated.

> Pig reading hive columnar rc tables
> -----------------------------------
>
>                 Key: PIG-1117
>                 URL: https://issues.apache.org/jira/browse/PIG-1117
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Gerrit Jansen van Vuuren
>         Attachments: HiveColumnarLoader.patch, HiveColumnarLoaderTest.patch, PIG-1117.patch
>
>
> I've coded a LoadFunc implementation that can read from Hive Columnar RC tables, this is needed for a project that I'm working on because all our data is stored using the Hive thrift serialized Columnar RC format. I have looked at the piggy bank but did not find any implementation that could do this. We've been running it on our cluster for the last week and have worked out most bugs.
>  
> There are still some improvements to be done but I would need  like setting the amount of mappers based on date partitioning. Its been optimized so as to read only specific columns and can churn through a data set almost 8 times faster with this improvement because not all column data is read.
> I would like to contribute the class to the piggybank can you guide me in what I need to do?
> I've used hive specific classes to implement this, is it possible to add this to the piggy bank build ivy for automatic download of the dependencies?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.