Posted to issues@solr.apache.org by "Jason Gerlowski (Jira)" <ji...@apache.org> on 2021/08/09 20:37:00 UTC

[jira] [Commented] (SOLR-15500) Compressed Backup

    [ https://issues.apache.org/jira/browse/SOLR-15500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396272#comment-17396272 ] 

Jason Gerlowski commented on SOLR-15500:
----------------------------------------

Thanks for offering to help out, Sayan. Maybe we can get into specifics a bit.

(Sorry for the late reply - quite behind on mail - hopefully your offer still stands.)

You may already be aware of this, but Solr recently added support for "incremental backups" in 8.9 (see [here|https://issues.apache.org/jira/browse/SOLR-15086] or [here|https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore]). This work changed the backup format so that multiple backups can now live side-by-side in the same location. Repeated backups of the same collection are smart enough to avoid re-uploading files that haven't changed since a previous backup uploaded them. This brings big efficiency improvements for that common backup use case, which is cool.

But it does complicate your compression use case somewhat, unfortunately. Compression is easy to imagine with Solr's legacy backup structure, but with the new file structure, Solr scans the files already present in the backup location to identify which index files were uploaded by a previous backup. Were you aware of the new format changes, by chance? If so, did you have any ideas about how that might be handled? I guess we could just leave the uncompressed files around after creating a zip/tarball of them? Or maybe compression is something we'd only support in the legacy backup file format?
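
To make the tension concrete, here is a toy Python sketch. It is not Solr's actual implementation: the function names, file names, and the "map of stored file name to checksum" model of a backup location are all assumptions made up for illustration. It only shows the general idea that a scan can reuse files it can see individually, and cannot reuse anything hidden inside a single archive.

```python
import hashlib
from typing import Dict, List


def checksum(data: bytes) -> str:
    """Content hash used to recognize a file that was already uploaded."""
    return hashlib.sha256(data).hexdigest()


def files_to_upload(local_index: Dict[str, bytes],
                    repo_files: Dict[str, str]) -> List[str]:
    """Return the local index files that are missing from the backup location.

    repo_files maps a stored file name to its checksum. It stands in for
    whatever a scan of the backup location would find (an assumption of
    this toy model, not Solr's real metadata format).
    """
    present = set(repo_files.values())
    return sorted(name for name, data in local_index.items()
                  if checksum(data) not in present)


# Hypothetical index contents at the time of two successive backups.
first = {"_0.cfs": b"segment-0", "segments_1": b"commit-1"}
second = {"_0.cfs": b"segment-0", "_1.cfs": b"segment-1",
          "segments_2": b"commit-2"}

# Case 1: the first backup stored each file individually, so the second
# backup can see that _0.cfs is already there and skip it.
plain_repo = {name: checksum(data) for name, data in first.items()}
plain_uploads = files_to_upload(second, plain_repo)

# Case 2: the first backup was replaced by one compressed archive. The
# scan sees a single opaque blob, so nothing can be reused.
archived_repo = {"backup_0.tar.gz": checksum(b"".join(first.values()))}
archived_uploads = files_to_upload(second, archived_repo)

print(plain_uploads)     # ['_1.cfs', 'segments_2']
print(archived_uploads)  # ['_0.cfs', '_1.cfs', 'segments_2']
```

In this model, keeping the uncompressed files next to the zip/tarball is what would let the second backup still skip the unchanged file.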

Had you thought at all about how you'd do the compression?

> Compressed Backup
> -----------------
>
>                 Key: SOLR-15500
>                 URL: https://issues.apache.org/jira/browse/SOLR-15500
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Sayan Das
>            Priority: Major
>
> Right now in BliBli, we do dirty hacks to compress backups from the backup scheduler VMs. It would be great if we could improve the collection BACKUP command with an expert flag that compresses the backup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
