Posted to dev@phoenix.apache.org by "Geoffrey Jacoby (JIRA)" <ji...@apache.org> on 2016/12/02 18:33:58 UTC
[jira] [Commented] (PHOENIX-541) Make mutable batch size bytes-based instead of row-based
[ https://issues.apache.org/jira/browse/PHOENIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15715893#comment-15715893 ]
Geoffrey Jacoby commented on PHOENIX-541:
-----------------------------------------
I spoke offline with [~samarthjain], and he suggested a better approach: rather than throwing an exception when a MutationState has too many bytes (as it does now with too many rows), transparently partition the list of mutations to be batched to HBase into sub-lists that each stay under the byte-size limit. This is the approach I adopted; the patch is attached.
By default, the max byte size is Long.MAX_VALUE for backwards compatibility.
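
To illustrate the partitioning idea described above, here is a minimal sketch. It is not the code from the attached patch: the class name BatchPartitioner, the method partitionByBytes, and the sizeOf callback are all hypothetical, and a generic element type stands in for HBase mutations so the example runs on its own. One assumption worth calling out: a single item that is larger than the limit by itself is sent as a one-element batch rather than rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical helper, not taken from the PHOENIX-541 patch.
final class BatchPartitioner {

    // Splits items into consecutive sub-lists whose estimated total size
    // does not exceed maxBytes. Assumption: an item that alone exceeds
    // maxBytes is placed in a batch by itself instead of raising an error.
    static <T> List<List<T>> partitionByBytes(
            List<T> items, long maxBytes, ToLongFunction<T> sizeOf) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>();
        long currentBytes = 0L;

        for (T item : items) {
            long size = sizeOf.applyAsLong(item);
            // Close the current batch if adding this item would cross the limit.
            // Written as a subtraction so it cannot overflow when maxBytes
            // is Long.MAX_VALUE (sizes are assumed non-negative).
            if (!current.isEmpty() && size > maxBytes - currentBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0L;
            }
            current.add(item);
            currentBytes += size;
        }
        // Flush whatever is left over.
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

With a limit of Long.MAX_VALUE the check never triggers, so all items land in a single batch, which matches the backwards-compatible default mentioned above. In real use the sizeOf callback would need some estimate of each mutation's serialized or heap size; which estimate is appropriate is left open here.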
> Make mutable batch size bytes-based instead of row-based
> --------------------------------------------------------
>
> Key: PHOENIX-541
> URL: https://issues.apache.org/jira/browse/PHOENIX-541
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 3.0-Release
> Reporter: mujtaba
> Assignee: Geoffrey Jacoby
> Labels: newbie
> Fix For: 4.10.0
>
> Attachments: PHOENIX-541.patch
>
>
> With the current row-count-based mutable batch size, the ideal batch size when creating indexes is around 800 rather than the current 15k, based on memory consumption, CPU, and GC (data size: key: ~60 bytes, 14 integer columns in separate CFs)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)