Posted to issues@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2016/02/26 01:36:18 UTC
[jira] [Updated] (HIVE-13161) ORC: Always do sloppy overlaps for DiskRanges
[ https://issues.apache.org/jira/browse/HIVE-13161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HIVE-13161:
---------------------------
Component/s: ORC
> ORC: Always do sloppy overlaps for DiskRanges
> ---------------------------------------------
>
> Key: HIVE-13161
> URL: https://issues.apache.org/jira/browse/HIVE-13161
> Project: Hive
> Issue Type: Bug
> Components: ORC
> Affects Versions: 1.3.0, 2.1.0
> Reporter: Gopal V
> Assignee: Prasanth Jayachandran
>
> The selected columns are sometimes only a few bytes apart (particularly for nulls, which compress tightly), and the reads aren't merged.
> The WORST_UNCOMPRESSED_SLOP is applied only in the PPD case, and it is applied more for safety than to reduce the total number of round-trip calls to the filesystem.
> {code}
>   /**
>    * Update the disk ranges to collapse adjacent or overlapping ranges. It
>    * assumes that the ranges are sorted.
>    * @param ranges the list of disk ranges to merge
>    */
>   static void mergeDiskRanges(List<DiskRange> ranges) {
>     DiskRange prev = null;
>     for(int i=0; i < ranges.size(); ++i) {
>       DiskRange current = ranges.get(i);
>       if (prev != null && overlap(prev.offset, prev.end,
>           current.offset, current.end)) {
>         prev.offset = Math.min(prev.offset, current.offset);
>         prev.end = Math.max(prev.end, current.end);
>         ranges.remove(i);
>         i -= 1;
>       } else {
>         prev = current;
>       }
>     }
>   }
> ...
>   private static boolean overlap(long leftA, long rightA, long leftB, long rightB) {
>     if (leftA <= leftB) {
>       return rightA >= leftB;
>     }
>     return rightB >= leftA;
>   }
> {code}
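A minimal sketch of what a "sloppy" overlap could look like, assuming the merge simply tolerates a gap of up to some number of bytes between sorted ranges. The `Range` class, the `overlapWithSlop` name, and the 16-byte slop are hypothetical stand-ins for illustration; they are not the Hive `DiskRange` API or the actual value of WORST_UNCOMPRESSED_SLOP.

```java
import java.util.ArrayList;
import java.util.List;

public class SloppyMerge {
    /** Hypothetical stand-in for DiskRange: a byte range from offset to end. */
    static final class Range {
        long offset;
        long end;
        Range(long offset, long end) { this.offset = offset; this.end = end; }
        @Override public String toString() { return "[" + offset + ", " + end + ")"; }
    }

    /** True if the ranges overlap, touch, or are at most slop bytes apart. */
    static boolean overlapWithSlop(long leftA, long rightA,
                                   long leftB, long rightB, long slop) {
        if (leftA <= leftB) {
            return rightA + slop >= leftB;
        }
        return rightB + slop >= leftA;
    }

    /** Same loop as mergeDiskRanges above, but with a gap tolerance (assumes sorted input). */
    static void mergeDiskRanges(List<Range> ranges, long slop) {
        Range prev = null;
        for (int i = 0; i < ranges.size(); ++i) {
            Range current = ranges.get(i);
            if (prev != null && overlapWithSlop(prev.offset, prev.end,
                    current.offset, current.end, slop)) {
                // Grow the previous range to cover the current one, gap included.
                prev.offset = Math.min(prev.offset, current.offset);
                prev.end = Math.max(prev.end, current.end);
                ranges.remove(i);
                i -= 1;
            } else {
                prev = current;
            }
        }
    }

    public static void main(String[] args) {
        List<Range> ranges = new ArrayList<>();
        ranges.add(new Range(0, 100));
        ranges.add(new Range(104, 200));   // 4 bytes after the first range
        ranges.add(new Range(5000, 6000)); // far away, stays separate
        mergeDiskRanges(ranges, 16);       // 16 is an arbitrary illustrative slop
        System.out.println(ranges);
    }
}
```

With a slop of 16, the first two ranges collapse into one read covering bytes 0 to 200, at the cost of fetching the 4 unused bytes in the gap, while the distant third range is left alone. With a slop of 0 the check behaves like the strict `overlap` above and all three ranges stay separate.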