Posted to issues@trafodion.apache.org by "Suresh Subbiah (JIRA)" <ji...@apache.org> on 2015/11/05 21:26:27 UTC

[jira] [Commented] (TRAFODION-1550) Bulk load performance can be improved by increasing HBASE_ROWSET_VSBB_SIZE

    [ https://issues.apache.org/jira/browse/TRAFODION-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992399#comment-14992399 ] 

Suresh Subbiah commented on TRAFODION-1550:
-------------------------------------------

Added a new default, TRAF_LOAD_FLUSH_SIZE_IN_KB, with a default value of 1024.
The generator converts this flush size in KB to a number of rows and uses that as the insert size when the executor writes to an HFile. Regardless of the value of this default, the flush size in rows is capped at 32767. The explain plan shows the flush size in rows used for each load.
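
As an illustration of the KB-to-rows conversion described above, here is a minimal sketch in Python. It is not the generator's actual code: the function name, the use of an estimated row width in bytes, the 1 KB = 1024 bytes convention, and the lower bound of one row are all assumptions. Only the 1024 KB default and the 32767-row cap come from this comment.

```python
MAX_FLUSH_ROWS = 32767       # cap on flush size in rows, as stated in the comment
DEFAULT_FLUSH_SIZE_KB = 1024 # default value of TRAF_LOAD_FLUSH_SIZE_IN_KB


def flush_size_in_rows(flush_size_kb: int, row_bytes: int) -> int:
    """Convert a flush size in KB to a row count (hypothetical helper).

    Assumes 1 KB = 1024 bytes and that row_bytes is an estimate of the
    encoded row width; the real generator may estimate this differently.
    """
    if flush_size_kb <= 0 or row_bytes <= 0:
        raise ValueError("flush_size_kb and row_bytes must be positive")
    rows = (flush_size_kb * 1024) // row_bytes
    # Flush at least one row (assumption), and never exceed the stated cap.
    return max(1, min(rows, MAX_FLUSH_ROWS))


if __name__ == "__main__":
    # Narrow table such as TPC-H lineitem (~150 bytes per row).
    print(flush_size_in_rows(DEFAULT_FLUSH_SIZE_KB, 150))
    # A much larger setting hits the cap.
    print(flush_size_in_rows(8192, 150))
```

Under these assumptions, the 1024 KB default with ~150-byte rows works out to 6990 rows per flush, compared with the earlier fixed 1024 rows, and larger settings stop growing once they reach 32767 rows.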


> Bulk load performance can be improved by increasing HBASE_ROWSET_VSBB_SIZE
> --------------------------------------------------------------------------
>
>                 Key: TRAFODION-1550
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1550
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: sql-general
>            Reporter: Suresh Subbiah
>            Assignee: Suresh Subbiah
>             Fix For: 1.2-incubating
>
>
> Bulk load flushes rows to HFile in batches of size HBASE_ROWSET_VSBB_SIZE. The default value for this cqd is 1024. A flush size of 1024 rows is small, particularly for narrow tables like TPC-H lineitem (~150 bytes per row).
> Increasing the flush size to 10,000 or 20,000 rows improved performance by more than 100%. Please add code to determine a more suitable flush size for a given table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)