Posted to dev@phoenix.apache.org by "Ravi Kishore Valeti (JIRA)" <ji...@apache.org> on 2015/10/13 16:19:05 UTC

[jira] [Commented] (PHOENIX-541) Make mutable batch size bytes-based instead of row-based

    [ https://issues.apache.org/jira/browse/PHOENIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955008#comment-14955008 ] 

Ravi Kishore Valeti commented on PHOENIX-541:
---------------------------------------------

This JIRA would be extremely useful for MapReduce-based secondary index builds, especially when they run in memory-constrained containers.

The Resource Manager kills containers that overshoot their memory limit and re-triggers them later, which can either lengthen overall job execution time or cause the job to fail.

We could apply dynamic batch sizing based on the memory available to the Map/Reduce task, so that tasks do not overshoot memory while batching.
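
To make the idea concrete, here is a minimal sketch of deriving a row limit from a memory budget. It is not Phoenix code: BatchSizer, rowsPerBatch and heapBudgetBytes are hypothetical names, and it assumes that the average row size is known up front and that a fraction of the JVM max heap is a reasonable stand-in for the memory available to the task.

```java
/** Hypothetical helper, not part of Phoenix: derives a row-count batch limit from a memory budget. */
final class BatchSizer {
    private BatchSizer() {}

    /**
     * Returns how many rows of avgRowBytes fit into memoryBudgetBytes,
     * capped at maxRows and never lower than 1.
     */
    static int rowsPerBatch(long memoryBudgetBytes, long avgRowBytes, int maxRows) {
        if (memoryBudgetBytes <= 0 || avgRowBytes <= 0 || maxRows <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        long fit = memoryBudgetBytes / avgRowBytes;
        return (int) Math.max(1L, Math.min(fit, (long) maxRows));
    }

    /**
     * Assumption: the batching budget is a fixed fraction of the JVM max heap.
     * A real task would also have to account for its other buffers.
     */
    static long heapBudgetBytes(double fraction) {
        if (!(fraction > 0.0 && fraction <= 1.0)) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        return (long) (Runtime.getRuntime().maxMemory() * fraction);
    }
}
```

With a 512 MB budget and 2 MB rows, rowsPerBatch returns 256 even when the configured row limit is 1000. With small rows the configured limit still applies.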

Example: if the map task's maximum memory is set to 2 GB, the average row size is 2 MB (a wide table), and batching is set to 1000 rows, then the map task has to hold about 2 GB worth of mutations in memory. The Resource Manager may then kill the task for overshooting memory and re-trigger it later. Eventually the job either fails or takes a very long time to complete because of the retries.
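
A bytes-based batch would avoid this case by flushing on accumulated size rather than on row count alone. The following is only a sketch under assumptions: ByteBoundedBatch and its flusher callback are hypothetical names, mutations are modelled as plain byte arrays, and the array length is taken as the memory cost of a mutation, which understates the real heap footprint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical buffer, not part of Phoenix: flushes on a byte budget or a row limit, whichever is hit first. */
final class ByteBoundedBatch {
    private final long maxBytes;
    private final int maxRows;
    private final Consumer<List<byte[]>> flusher;
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;

    ByteBoundedBatch(long maxBytes, int maxRows, Consumer<List<byte[]>> flusher) {
        if (maxBytes <= 0 || maxRows <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxBytes = maxBytes;
        this.maxRows = maxRows;
        this.flusher = flusher;
    }

    /** Buffers one mutation, flushing before or after the add as needed to respect both limits. */
    void add(byte[] mutation) {
        // Flush first if this mutation would push a non-empty buffer over the byte budget.
        if (!pending.isEmpty() && pendingBytes + mutation.length > maxBytes) {
            flush();
        }
        pending.add(mutation);
        pendingBytes += mutation.length;
        // A single oversized mutation is still sent, as a batch of its own.
        if (pending.size() >= maxRows || pendingBytes >= maxBytes) {
            flush();
        }
    }

    /** Hands a copy of the buffered mutations to the flusher and resets the buffer. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flusher.accept(new ArrayList<>(pending));
        pending.clear();
        pendingBytes = 0;
    }
}
```

The caller is expected to invoke flush() once more at the end of the task so the final partial batch is not lost. Keeping the row limit alongside the byte budget preserves the existing behaviour for tables with small rows.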

[~jamestaylor]

> Make mutable batch size bytes-based instead of row-based
> --------------------------------------------------------
>
>                 Key: PHOENIX-541
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-541
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 3.0-Release
>            Reporter: mujtaba
>              Labels: newbie
>
> With the current configuration of row-count-based mutable batch size, the ideal value for batch size when creating indexes is around 800 rather than the current 15k, based on memory consumption, CPU and GC (data size: key: ~60 bytes, 14 integer columns in separate CFs)


