Posted to dev@geode.apache.org by Mario Kevo <ma...@est.tech> on 2019/10/02 08:20:27 UTC

How to post-process data from Geode backup

Hi geode-dev,

How can we post-process the data in a Geode backup, i.e. access the data in the
regions stored in the backup?
 
The assumption is that we cannot bring up the same cluster (otherwise we
would simply do a restore), although it helps if we have access to the
original cluster (to obtain the PDX registry). Since that is often not
possible, the procedure must work completely offline.
 
So, here goes:
1. Get the backup files.
2. Start gfsh and use the "export offline-disk-store" command (once
for each disk store) to obtain the GFD files for all regions in the disk
stores. Note that these GFD files (one per region) do not contain the PDX
registry! This is the exact opposite of the "export data" command (which
is executed online).
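
For illustration, step 2 could be scripted roughly as below. This is only a
sketch: the store names (storeA, storeB) and the directories are hypothetical
placeholders, and it assumes each disk store's files were restored into its
own subdirectory. The script prints the gfsh invocations as a dry run rather
than executing them.

```shell
#!/bin/sh
# Sketch only: store names and paths are hypothetical, not from a real cluster.

# Build the gfsh command that exports one offline disk store to snapshot files.
export_cmd() {
  store_name=$1   # name of the disk store
  disk_dir=$2     # directory holding the restored disk store files
  out_dir=$3      # directory that receives the per-region GFD files
  printf 'export offline-disk-store --name=%s --disk-dirs=%s --dir=%s' \
    "$store_name" "$disk_dir" "$out_dir"
}

# Dry run: print one gfsh invocation per disk store instead of running it.
export_all() {
  backup_root=$1  # assumed layout: <backup_root>/<store-name>
  out_root=$2
  shift 2
  for store in "$@"; do
    echo "gfsh -e \"$(export_cmd "$store" "$backup_root/$store" "$out_root/$store")\""
  done
}

export_all /backups/restored /tmp/gfd storeA storeB
```

Each printed line can be reviewed and then run by hand. As noted in step 2,
the resulting GFD files still lack the PDX registry.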


Do you have any idea how to load these files into a new cluster? Is it possible?
Is this a problem with "export offline-disk-store"?

BR,
Mario

Re: How to post-process data from Geode backup

Posted by Anthony Baker <ab...@pivotal.io>.
Hmmm, good question.  If I’m reading the code right, we skip exporting PDX types from an offline disk store (because the cache is closed).  Since the export doesn’t contain any PDX type definitions, they won’t get recreated during an import.

You could verify this by running “java org.apache.geode.internal.cache.snapshot.GFSnapshot …” with the right class path / args and seeing if any PDX types are dumped to stdout.

I think this *could* be done with some changes to ExportDiskRegion and GFSnapshot but it doesn’t look like it’s supported right now.

(Thinking bigger picture, I’d like to have better superuser commands to export/edit/import the PDX registry as well)

Anthony

