Posted to dev@hive.apache.org by "Mostafa Mokhtar (JIRA)" <ji...@apache.org> on 2014/09/05 02:21:24 UTC
[jira] [Updated] (HIVE-7990) With fetch column stats disabled number of elements in grouping set is not taken into account
[ https://issues.apache.org/jira/browse/HIVE-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mostafa Mokhtar updated HIVE-7990:
----------------------------------
Component/s: Statistics (was: File Formats)
> With fetch column stats disabled number of elements in grouping set is not taken into account
> ---------------------------------------------------------------------------------------------
>
> Key: HIVE-7990
> URL: https://issues.apache.org/jira/browse/HIVE-7990
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Affects Versions: 0.14.0
> Environment: Loading into orc
> Reporter: Mostafa Mokhtar
> Assignee: Prasanth J
> Labels: performance
> Fix For: 0.14.0
>
>
> When loading into an unpartitioned ORC table, the call to WriterImpl$StructTreeWriter.write in WriterImpl.addRow is synchronized.
> When hive.optimize.sort.dynamic.partition is enabled, the current thread is the only writer, so the synchronization is not needed.
> Checking memory on every row is also overkill; the check could be done once every 1K rows or so.
> {code}
>   public void addRow(Object row) throws IOException {
>     synchronized (this) {
>       treeWriter.write(row);
>       rowsInStripe += 1;
>       if (buildIndex) {
>         rowsInIndex += 1;
>         if (rowsInIndex >= rowIndexStride) {
>           createRowIndexEntry();
>         }
>       }
>     }
>     memoryManager.addedRow();
>   }
> {code}
> This can improve ORC load performance by 7%.
> {code}
> Stack Trace                                                                  Sample Count  Percentage(%)
> WriterImpl.addRow(Object)                                                    5,852         65.782
> WriterImpl$StructTreeWriter.write(Object)                                    5,163         58.037
> MemoryManager.addedRow()                                                     666           7.487
> MemoryManager.notifyWriters()                                                648           7.284
> WriterImpl.checkMemory(double)                                               645           7.25
> WriterImpl.flushStripe()                                                     643           7.228
> WriterImpl$StructTreeWriter.writeStripe(OrcProto$StripeFooter$Builder, int)  584           6.565
> {code}
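
The change suggested in the description (drop the lock when there is a single writer thread, and consult the memory manager once per batch of rows rather than on every row) could look roughly like the sketch below. This is not the actual Hive patch: CountingMemoryManager, SketchWriter, and the ROWS_BETWEEN_CHECKS constant are hypothetical names invented for illustration, and the interval of 1000 is just the "1K rows or so" figure from the description.

```java
// Hypothetical stand-in for ORC's MemoryManager: it only counts how often it is consulted.
class CountingMemoryManager {
    int checks = 0;

    void checkMemory() {
        checks++;
    }
}

// Hypothetical writer sketch; it is NOT Hive's WriterImpl.
class SketchWriter {
    // Assumed interval, taken from the "per 1K rows" suggestion in the description.
    static final int ROWS_BETWEEN_CHECKS = 1000;

    private final CountingMemoryManager memoryManager;
    private final int rowIndexStride;
    private long rowsInStripe = 0;
    private int rowsInIndex = 0;
    private int rowsSinceCheck = 0;
    int indexEntries = 0;

    SketchWriter(CountingMemoryManager memoryManager, int rowIndexStride) {
        this.memoryManager = memoryManager;
        this.rowIndexStride = rowIndexStride;
    }

    // Assumes a single writer thread, so there is no synchronized block.
    void addRow(Object row) {
        rowsInStripe += 1;
        rowsInIndex += 1;
        if (rowsInIndex >= rowIndexStride) {
            indexEntries += 1; // stands in for createRowIndexEntry()
            rowsInIndex = 0;
        }
        // Consult the memory manager once per batch instead of once per row.
        rowsSinceCheck += 1;
        if (rowsSinceCheck >= ROWS_BETWEEN_CHECKS) {
            memoryManager.checkMemory();
            rowsSinceCheck = 0;
        }
    }

    long getRowsInStripe() {
        return rowsInStripe;
    }
}

class BatchedMemoryCheckDemo {
    public static void main(String[] args) {
        CountingMemoryManager mm = new CountingMemoryManager();
        SketchWriter writer = new SketchWriter(mm, 10000);
        for (int i = 0; i < 2500; i++) {
            writer.addRow(i);
        }
        // 2500 rows with a check every 1000 rows gives 2 memory checks.
        System.out.println(writer.getRowsInStripe() + " rows, " + mm.checks + " memory checks");
    }
}
```

Whether the lock can really be dropped depends on the writer being driven by only one thread, which the description says holds when hive.optimize.sort.dynamic.partition is enabled; the sketch simply assumes it.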
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)