Posted to issues@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2015/06/17 20:26:00 UTC
[jira] [Commented] (HIVE-11033) BloomFilter index is not honored by ORC reader

    [ https://issues.apache.org/jira/browse/HIVE-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590261#comment-14590261 ] 

Hive QA commented on HIVE-11033:
--------------------------------



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740053/HIVE-11033.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9008 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_corr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4287/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4287/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4287/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740053 - PreCommit-HIVE-TRUNK-Build

> BloomFilter index is not honored by ORC reader
> ----------------------------------------------
>
>                 Key: HIVE-11033
>                 URL: https://issues.apache.org/jira/browse/HIVE-11033
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Allan Yan
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-11033.patch
>
>
> There is a bug in the org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl class that prevents the bloom filter index saved in the ORC file from being used. The root cause is that the bloomFilterIndices variable defined in the SargApplier class shadows the one defined in its parent class. As a result, in RecordReaderImpl.pickRowGroups():
> {code}
>   protected boolean[] pickRowGroups() throws IOException {
>     // if we don't have a sarg or indexes, we read everything
>     if (sargApp == null) {
>       return null;
>     }
>     readRowIndex(currentStripe, included, sargApp.sargColumns);
>     return sargApp.pickRowGroups(stripes.get(currentStripe), indexes);
>   }
> {code}
> The bloomFilterIndices array populated by readRowIndex() is not picked up by the sargApp object. One solution is to make SargApplier.bloomFilterIndices a reference to its parent counterpart:
> {noformat}
> 18:46 $ diff src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.original
> 174d173
> <     bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 178c177
> <           sarg, options.getColumnNames(), strideRate, types, included.length, bloomFilterIndices);
> ---
> >           sarg, options.getColumnNames(), strideRate, types, included.length);
> 204a204
> >     bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 673c673
> <         List<OrcProto.Type> types, int includedCount, OrcProto.BloomFilterIndex[] bloomFilterIndices) {
> ---
> >         List<OrcProto.Type> types, int includedCount) {
> 677c677
> <       this.bloomFilterIndices = bloomFilterIndices;
> ---
> >       bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> {noformat}
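
The failure mode described in the issue can be reproduced in isolation. The following is a minimal sketch, not Hive code: the class and method names (BloomFilterSharingSketch, PrivateCopyApplier, SharedArrayApplier) are hypothetical stand-ins, and String[] stands in for OrcProto.BloomFilterIndex[]. It assumes only what the description and diff state: the reader fills its own array in readRowIndex(), while the applier consults a separately allocated array unless it is handed a reference to the reader's.

```java
import java.util.Arrays;

// Hypothetical stand-ins for illustration; none of these are the real Hive classes.
public class BloomFilterSharingSketch {

    /** Buggy shape: the applier allocates its own array, which nothing ever fills. */
    static class PrivateCopyApplier {
        private final String[] bloomFilterIndices;

        PrivateCopyApplier(int columnCount) {
            this.bloomFilterIndices = new String[columnCount];
        }

        boolean seesBloomFilter(int column) {
            return bloomFilterIndices[column] != null;
        }
    }

    /** Fixed shape: the applier keeps a reference to the array the reader fills. */
    static class SharedArrayApplier {
        private final String[] bloomFilterIndices;

        SharedArrayApplier(String[] bloomFilterIndices) {
            this.bloomFilterIndices = bloomFilterIndices;
        }

        boolean seesBloomFilter(int column) {
            return bloomFilterIndices[column] != null;
        }
    }

    /** Stand-in for readRowIndex(): fills the reader-owned array in place. */
    static void readRowIndex(String[] readerIndices) {
        Arrays.fill(readerIndices, "bloom-filter");
    }

    static boolean buggyApplierSeesIndex() {
        String[] readerIndices = new String[3];
        PrivateCopyApplier applier = new PrivateCopyApplier(readerIndices.length);
        readRowIndex(readerIndices);
        return applier.seesBloomFilter(0);
    }

    static boolean fixedApplierSeesIndex() {
        // The array must exist before the applier is constructed.
        String[] readerIndices = new String[3];
        SharedArrayApplier applier = new SharedArrayApplier(readerIndices);
        readRowIndex(readerIndices);
        return applier.seesBloomFilter(0);
    }

    public static void main(String[] args) {
        System.out.println("private copy sees index: " + buggyApplierSeesIndex());
        System.out.println("shared array sees index: " + fixedApplierSeesIndex());
    }
}
```

Sharing works only if the array is allocated before the applier is constructed and is then filled in place rather than reassigned. That is consistent with the diff above, which moves the allocation of bloomFilterIndices ahead of the SargApplier constructor call.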



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)