Posted to issues@aurora.apache.org by "Kevin Sweeney (JIRA)" <ji...@apache.org> on 2014/09/17 20:05:33 UTC

[jira] [Updated] (AURORA-722) snapshot performance issues

     [ https://issues.apache.org/jira/browse/AURORA-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Sweeney updated AURORA-722:
---------------------------------
    Sprint: Aurora Q3 Sprint 2

> snapshot performance issues
> ---------------------------
>
>                 Key: AURORA-722
>                 URL: https://issues.apache.org/jira/browse/AURORA-722
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Kevin Sweeney
>            Assignee: Kevin Sweeney
>             Fix For: 0.6.0
>
>
> In one of our larger production clusters we're seeing snapshot performance issues that cause the scheduler to fail over before it completes a snapshot.
> For background, the scheduler writes a binary-encoded Snapshot (from api.thrift), compressed when -deflate_snapshots is enabled, to the Mesos replicated log every hour (or at the interval set by -dlog_snapshot_interval). This snapshot represents most of the scheduler's heap usage, including the configuration for all tasks running in the cluster.
> Add appropriate instrumentation to the snapshot routine and patch any obvious performance bottlenecks.
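
As a rough illustration of the kind of per-stage instrumentation requested above, here is a minimal sketch in Python. It is not Aurora code: the scheduler is a Java service, and the names build_snapshot and timed are hypothetical. pickle stands in for Thrift binary encoding and zlib for the deflate step, purely so the example is self-contained. The point is only that timing each stage separately shows whether serialization or compression dominates.

```python
import pickle
import time
import zlib
from contextlib import contextmanager


@contextmanager
def timed(timings, stage):
    """Accumulate the wall-clock seconds spent in `stage` into `timings`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[stage] = timings.get(stage, 0.0) + elapsed


def build_snapshot(state, deflate=True):
    """Serialize `state` and optionally compress it, timing each stage.

    Assumption: pickle is a stand-in for Thrift binary encoding, and
    `deflate` mirrors the idea of the -deflate_snapshots flag.
    Returns (payload_bytes, timings_by_stage).
    """
    timings = {}
    with timed(timings, "serialize"):
        payload = pickle.dumps(state)
    if deflate:
        with timed(timings, "deflate"):
            payload = zlib.compress(payload)
    return payload, timings


if __name__ == "__main__":
    # Hypothetical cluster state: many tasks, each carrying its configuration.
    state = {"tasks": [{"id": i, "config": "x" * 200} for i in range(50_000)]}
    payload, timings = build_snapshot(state)
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds:.4f}s")
    print(f"snapshot size: {len(payload)} bytes")
```

In a real scheduler these timings would more likely be exported as metrics than printed, so that slow snapshots can be correlated with failovers.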



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)