You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@zookeeper.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/06/30 08:52:00 UTC
[jira] [Updated] (ZOOKEEPER-4566) Create tool for recursive snapshot analysis
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated ZOOKEEPER-4566:
--------------------------------------
Labels: pull-request-available (was: )
> Create tool for recursive snapshot analysis
> -------------------------------------------
>
> Key: ZOOKEEPER-4566
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4566
> Project: ZooKeeper
> Issue Type: Improvement
> Reporter: Szabolcs Bukros
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I needed to analyze snapshots to determine which application caused a massive snapshot size increase by recursively checking child node count and data size for nodes, but could not find a tool for the job. Loading the snapshots one by one and using a ZooKeeper client proved too slow and SnapshotFormatter was very fast but processing the output to get the relevant data for my usecase proved more work than writing a tool that has the output I need. So I wrote SnapshotSumFormatter based on SnapshotFormatter:
> {code:java}
> USAGE: SnapshotSumFormatter snapshot_file starting_node max_depth
> {code}
> The tool recursively travels the child nodes under "starting_node" and collects both node count and summarizes the data stored in every node under the current one. This helps to identify problematic jobs/applications that either store too much data or does not properly clean up. "max_depth" defines the depth where the tool still writes to the output. 0 means there is no depth limit, every node's stats will be displayed, 1 means it will only contain the starting node's and it's children's stats, 2 ads another level and so on. This ONLY affects the level of details displayed, NOT the calculation.
> An example output looks like this (with "SnapshotSumFormatter <snapshot_file> / 2"):
> {code:java}
> /
> children: 1250511
> data: 1952186580
> -- /zookeeper
> -- children: 1
> -- data: 0
> -- /solr
> -- children: 1773
> -- data: 8419162
> ---- /solr/configs
> ---- children: 1640
> ---- data: 8407643
> ---- /solr/overseer
> ---- children: 6
> ---- data: 0
> ---- /solr/live_nodes
> ---- children: 3
> ---- data: 0
> {code}
> I think this might prove useful for others too and would like to share it.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)