Posted to dev@phoenix.apache.org by "Geoffrey Jacoby (JIRA)" <ji...@apache.org> on 2016/12/02 18:33:58 UTC

[jira] [Commented] (PHOENIX-541) Make mutable batch size bytes-based instead of row-based

    [ https://issues.apache.org/jira/browse/PHOENIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15715893#comment-15715893 ] 

Geoffrey Jacoby commented on PHOENIX-541:
-----------------------------------------

I spoke offline with [~samarthjain], and he suggested a better approach: rather than throwing an exception when a MutationState has too many bytes (as it does now with too many rows), transparently partition the list of mutations to be batched to HBase into sub-lists that are each smaller than the byte size boundary. This is the approach I adopted; the patch is attached.

By default the max byte size is Long.MAX_VALUE for backwards compatibility.
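
For readers without the patch at hand, the partitioning idea can be sketched roughly as below. This is only an illustration and not the code in the attached patch: ByteBoundedBatcher, SizedMutation and partition are hypothetical names, and the sketch assumes each mutation can report its own size in bytes. It also assumes that a single mutation larger than the limit is sent in a batch of its own rather than rejected.

```java
import java.util.ArrayList;
import java.util.List;

public class ByteBoundedBatcher {

    /** Hypothetical stand-in for an HBase mutation; only its byte size matters here. */
    public static final class SizedMutation {
        final String rowKey;
        final long sizeBytes;

        public SizedMutation(String rowKey, long sizeBytes) {
            this.rowKey = rowKey;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Splits the mutations, in order, into sub-lists whose total size stays
     * within maxBytes. Assumption: a single mutation that is larger than
     * maxBytes gets a sub-list of its own instead of causing an exception.
     */
    public static List<List<SizedMutation>> partition(List<SizedMutation> mutations, long maxBytes) {
        List<List<SizedMutation>> batches = new ArrayList<>();
        List<SizedMutation> current = new ArrayList<>();
        long currentBytes = 0;
        for (SizedMutation m : mutations) {
            // Written as a subtraction so the check cannot overflow
            // when maxBytes is Long.MAX_VALUE.
            if (!current.isEmpty() && m.sizeBytes > maxBytes - currentBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(m);
            currentBytes += m.sizeBytes;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

With a limit of Long.MAX_VALUE the loop never closes a batch early, so all mutations end up in one sub-list, which matches the backwards-compatible default described above.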

> Make mutable batch size bytes-based instead of row-based
> --------------------------------------------------------
>
>                 Key: PHOENIX-541
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-541
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 3.0-Release
>            Reporter: mujtaba
>            Assignee: Geoffrey Jacoby
>              Labels: newbie
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-541.patch
>
>
> With the current row-count-based mutable batch size configuration, the ideal batch size when creating indexes is around 800 rather than the current 15k, based on memory consumption, CPU and GC (data size: key: ~60 bytes, 14 integer columns in separate CFs)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)