Posted to dev@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2014/02/28 21:46:19 UTC

[jira] [Updated] (HIVE-6524) Update ORC Filedump stripe sizes to match the memory manager changes

     [ https://issues.apache.org/jira/browse/HIVE-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated HIVE-6524:
--------------------------

    Attachment: HIVE-6524.1-tez.patch

{code}
 Stripes:
-  Stripe: offset: 3 data: 102311 rows: 4000 tail: 68 index: 224
+  Stripe: offset: 3 data: 144733 rows: 5000 tail: 68 index: 235
{code}

Everything else in the golden output changes because the first stripe now holds 5k rows.

A previous 21k ORC writer was leaking into the next file, which ended up with 4k rows in its first stripe instead of the full 5k.
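
The 4k vs. 5k arithmetic can be reproduced with a toy model. This is only a sketch and not the actual Hive MemoryManager code: it assumes that "21k" means the previous writer added 21,000 rows, that a memory check fires every 5,000 rows counted across writers, and that in this test each check flushes the current stripe. ToyMemoryManager, first_stripe_rows and ROWS_BETWEEN_CHECKS are names made up for the illustration.

```python
ROWS_BETWEEN_CHECKS = 5000  # assumed check interval for this sketch


class ToyMemoryManager:
    """Toy model (not Hive code): one row counter shared by all writers."""

    def __init__(self, reset_on_last_close):
        self.reset_on_last_close = reset_on_last_close
        self.rows_since_check = 0
        self.open_writers = set()

    def add_writer(self, name):
        self.open_writers.add(name)

    def remove_writer(self, name):
        self.open_writers.discard(name)
        # Behaviour described in the issue: reset once no writers remain open.
        if self.reset_on_last_close and not self.open_writers:
            self.rows_since_check = 0

    def add_row(self):
        """Count one row; return True when a memory check fires."""
        self.rows_since_check += 1
        if self.rows_since_check >= ROWS_BETWEEN_CHECKS:
            self.rows_since_check = 0
            return True
        return False


def first_stripe_rows(manager, previous_rows, rows):
    """Rows in the first stripe of a file written after a previous writer.

    Assumes the first memory check is what ends the first stripe.
    """
    manager.add_writer("previous")
    for _ in range(previous_rows):
        manager.add_row()
    manager.remove_writer("previous")

    manager.add_writer("next")
    for n in range(1, rows + 1):
        if manager.add_row():
            return n
    return rows


# Without the reset, 21000 % 5000 = 1000 rows carry over into the next file.
print(first_stripe_rows(ToyMemoryManager(False), 21000, 10000))  # 4000
# With the reset, the next file starts counting from zero.
print(first_stripe_rows(ToyMemoryManager(True), 21000, 10000))   # 5000
```

Under these assumptions the leftover 1,000 rows from the previous writer account for the old 4,000-row first stripe, and the reset restores the full 5,000.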

> Update ORC Filedump stripe sizes to match the memory manager changes
> --------------------------------------------------------------------
>
>                 Key: HIVE-6524
>                 URL: https://issues.apache.org/jira/browse/HIVE-6524
>             Project: Hive
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: tez-branch
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>             Fix For: tez-branch
>
>         Attachments: HIVE-6524.1-tez.patch
>
>
> The MemoryManager in ORC now resets to default whenever we close all open writers.
> This results in consistent stripe sizes for all files being written, but they differ from the test goldens.
> Fix the goldens.


