Posted to user@couchdb.apache.org by Alex P <ap...@kolosy.com> on 2009/10/19 21:18:26 UTC

couch behavior for san snapshots

hello,

we ran into an issue this weekend where taking a SAN (amazon ebs) snapshot
of a couchdb mount somehow brought over only part of the data. specifically,
there were two small dbs (call them db1 and db2) on the couch instance, each
with several hundred documents. taking the snapshot and bringing it up again
showed both databases, but one had 0 documents and an update sequence of 0.
the instance the snapshot was taken from still has both databases showing
with several hundred documents. we repeated this process several times, with
the same effect, on the same db.

any thoughts would be much appreciated, as we are about to go live, and not
being able to do a backup of our db is ... disturbing.

thanks,
alex.

Re: couch behavior for san snapshots

Posted by Alex P <ap...@kolosy.com>.
we're using 0.9.1. going to switch to 0.10 in the next few days. i'll do the
diff tonight.

On Mon, Oct 19, 2009 at 2:42 PM, Chris Anderson <jc...@apache.org> wrote:

> On Mon, Oct 19, 2009 at 12:37 PM, Alex P <ap...@kolosy.com> wrote:
> > i'll take a peek tonight, but it just doesn't sound like corruption. the db
> > is visible, readable, writable, just has an update sequence of 0, 0 docs and
> > a db size (as reported by futon) of 4.0kb.
> >
> > is db size computed off the file size, or based on actual contents? because
> > if it's the file size, it really doesn't sound like corruption...
> >
>
> I think it just reports the file size. So if the SAN-restore only
> brought back 4kb, then there's something to investigate.
>
> It could be that the SAN inserted a few bytes randomly in the file, so
> CouchDB couldn't find a btree root at the position indicated by the
> header.
>
> Which version of CouchDB are you using?
>
> I think it's worth diffing the before-restore and after-restore files.
> If the after-restore is really only 4kb then of course CouchDB isn't
> seeing docs in it.
>
>
> --
> Chris Anderson
> http://jchrisa.net
> http://couch.io
>

Re: couch behavior for san snapshots

Posted by Chris Anderson <jc...@apache.org>.
On Mon, Oct 19, 2009 at 12:37 PM, Alex P <ap...@kolosy.com> wrote:
> i'll take a peek tonight, but it just doesn't sound like corruption. the db
> is visible, readable, writable, just has an update sequence of 0, 0 docs and
> a db size (as reported by futon) of 4.0kb.
>
> is db size computed off the file size, or based on actual contents? because
> if it's the file size, it really doesn't sound like corruption...
>

I think it just reports the file size. So if the SAN-restore only
brought back 4kb, then there's something to investigate.

It could be that the SAN inserted a few bytes randomly in the file, so
CouchDB couldn't find a btree root at the position indicated by the
header.

Which version of CouchDB are you using?

I think it's worth diffing the before-restore and after-restore files.
If the after-restore is really only 4kb then of course CouchDB isn't
seeing docs in it.
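
A minimal sketch of that comparison, assuming both the live file and the
restored file are reachable from one machine. The function name and the
example paths are made up for illustration. CouchDB database files are
append-only, so a live file that is slightly larger than its snapshot is
expected; a restored file of only a few kb is not.

```shell
# Print the size of each file in bytes, then report whether the
# contents are byte-for-byte identical. Returns 0 only when identical.
compare_db_files() {
    before=$1
    after=$2
    # wc -c < file prints only the byte count, without the file name
    echo "before: $(wc -c < "$before" | tr -d ' ') bytes"
    echo "after:  $(wc -c < "$after" | tr -d ' ') bytes"
    # cmp -s is silent and exits 0 only when the files match exactly
    if cmp -s "$before" "$after"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}

# Example (hypothetical restore path):
#   compare_db_files /usr/local/var/lib/couchdb/db1.couch /mnt/restore/db1.couch
```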




> (fwiw this is a separate, yet similar issue from the one i had earlier with
> data disappearing from a live system)



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: couch behavior for san snapshots

Posted by Alex P <ap...@kolosy.com>.
i'll take a peek tonight, but it just doesn't sound like corruption. the db
is visible, readable, writable, just has an update sequence of 0, 0 docs and
a db size (as reported by futon) of 4.0kb.

is db size computed off the file size, or based on actual contents? because
if it's the file size, it really doesn't sound like corruption...

(fwiw this is a separate, yet similar issue from the one i had earlier with
data disappearing from a live system)

On Mon, Oct 19, 2009 at 2:35 PM, Alex P <ap...@kolosy.com> wrote:

> cdb catalogs databases purely on file name and presence in the appropriate
> directory, right? so if i copy db1 to db1_copy, i should see db1_copy show
> up in futon?
>
>
> On Mon, Oct 19, 2009 at 2:23 PM, Paul Davis <pa...@gmail.com>wrote:
>
>> Alex,
>>
>> That's most odd. What happens if you try and cp the database file to a
>> new name? Something like:
>>
>> $ cp /usr/local/var/lib/couchdb/db1.couch \
>>     /usr/local/var/lib/couchdb/db3.couch
>>
>> Another thing to try would be:
>>
>> $ curl -X POST http://127.0.0.1:5984/db1/_ensure_full_commit
>>
>> And then try to snapshot or copy again. I highly doubt that POST would
>> affect anything, but it might be worth a shot.
>>
>> Paul Davis
>>
>
>

Re: couch behavior for san snapshots

Posted by Paul Davis <pa...@gmail.com>.
Alex,

I haven't the slightest idea why that happens. I tried googling at one
point but never found anything that seemed related.

Paul Davis

On Thu, Oct 22, 2009 at 11:38 AM, Alex P <ap...@kolosy.com> wrote:
> Paul - yes, that's how it was installed. We've read about issues surrounding
> this type of deployment and are rearranging our installation to have couch
> on ephemeral and data on ebs. Any conjecture why couch would be able to
> access the data files and not the app files on ebs?
>
> On Thu, Oct 22, 2009 at 10:05 AM, Paul Davis <pa...@gmail.com>wrote:
>
>> Alex,
>>
>> Are any of the beam files used by the running CouchDB node on the EBS
>> volume? I've seen these errors before and they're generally (always?)
>> a result of the Erlang VM being unable to read from its current
>> working directory.
>>
>> Paul Davis
>>
>>
>

Re: couch behavior for san snapshots

Posted by Andrew Melo <an...@gmail.com>.
Might have to do with some interaction on how they cache things. But
I'm not an expert by any means.

--
Andrew Melo
Andrew.Melo@gmail.com



On Thu, Oct 22, 2009 at 10:38 AM, Alex P <ap...@kolosy.com> wrote:
> Paul - yes, that's how it was installed. We've read about issues surrounding
> this type of deployment and are rearranging our installation to have couch
> on ephemeral and data on ebs. Any conjecture why couch would be able to
> access the data files and not the app files on ebs?
>
> On Thu, Oct 22, 2009 at 10:05 AM, Paul Davis <pa...@gmail.com>wrote:
>
>> Alex,
>>
>> Are any of the beam files used by the running CouchDB node on the EBS
>> volume? I've seen these errors before and they're generally (always?)
>> a result of the Erlang VM being unable to read from its current
>> working directory.
>>
>> Paul Davis
>>
>>
>

Re: couch behavior for san snapshots

Posted by Alex P <ap...@kolosy.com>.
Paul - yes, that's how it was installed. We've read about issues surrounding
this type of deployment and are rearranging our installation to have couch
on ephemeral and data on ebs. Any conjecture why couch would be able to
access the data files and not the app files on ebs?

On Thu, Oct 22, 2009 at 10:05 AM, Paul Davis <pa...@gmail.com>wrote:

> Alex,
>
> Are any of the beam files used by the running CouchDB node on the EBS
> volume? I've seen these errors before and they're generally (always?)
> a result of the Erlang VM being unable to read from its current
> working directory.
>
> Paul Davis
>
>

Re: couch behavior for san snapshots

Posted by Paul Davis <pa...@gmail.com>.
Alex,

Are any of the beam files used by the running CouchDB node on the EBS
volume? I've seen these errors before and they're generally (always?)
a result of the Erlang VM being unable to read from its current
working directory.
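
One way to answer that question is to ask which mount point holds a given
path. The helper below is only a sketch; its name and the example paths
are assumptions, not anything CouchDB ships.

```shell
# Print the mount point that holds a given path.
# df -P uses the POSIX output format, where the sixth column of the
# second line is the mount point.
mount_of() {
    df -P "$1" | awk 'NR==2 {print $6}'
}

# Example (hypothetical install and data paths): if both print the EBS
# mount, the beam files live on the same volume as the data.
#   mount_of /usr/local/lib/couchdb
#   mount_of /usr/local/var/lib/couchdb
```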

Paul Davis

On Thu, Oct 22, 2009 at 2:28 AM, Alex P <ap...@kolosy.com> wrote:
> so an update on this situation. we managed to replicate the problem. took a
> san snapshot, got an empty db. here's where it gets tricky. we dug around
> the log files, and found error messages that matched these:
>
> http://mail-archives.apache.org/mod_mbox/couchdb-user/200812.mbox/%3C4c69d7170812181239j6527a387oa63e129e20582b8f@mail.gmail.com%3E
>
> on a hunch, we ran a compaction... and voila! all docs are back.
>
> again, this is on 0.9.1 (we're moving to 0.10 tonight/tomorrow), but any
> insight as to what could have caused this would be appreciated.
>
> thanks,
> alex

Re: couch behavior for san snapshots

Posted by Alex P <ap...@kolosy.com>.
so an update on this situation. we managed to replicate the problem. took a
san snapshot, got an empty db. here's where it gets tricky. we dug around
the log files, and found error messages that matched these:

http://mail-archives.apache.org/mod_mbox/couchdb-user/200812.mbox/%3C4c69d7170812181239j6527a387oa63e129e20582b8f@mail.gmail.com%3E

on a hunch, we ran a compaction... and voila! all docs are back.

again, this is on 0.9.1 (we're moving to 0.10 tonight/tomorrow), but any
insight as to what could have caused this would be appreciated.
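
For anyone wanting to repeat the check, here is a rough sketch of reading
the doc count before and after a compaction. The helper names and the
COUCH_URL variable are assumptions; the sed expression is a quick filter
for the doc_count field, not a real JSON parser, and newer CouchDB
releases need admin rights and the JSON content type for _compact.

```shell
# Extract the doc_count field from the JSON returned by GET /<db>.
# Reads the JSON on stdin and prints the number.
doc_count() {
    sed -n 's/.*"doc_count":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Ask CouchDB to compact one database. The request returns right away;
# compaction itself runs in the background.
compact_db() {
    curl -s -X POST -H 'Content-Type: application/json' \
        "${COUCH_URL:-http://127.0.0.1:5984}/$1/_compact"
}

# Example:
#   curl -s http://127.0.0.1:5984/db1 | doc_count   # before
#   compact_db db1
#   curl -s http://127.0.0.1:5984/db1 | doc_count   # once compaction is done
```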

thanks,
alex

On Mon, Oct 19, 2009 at 11:10 PM, Alex P <ap...@kolosy.com> wrote:

> we're using xfs, but thanks for the offer.
>
> looks like someone discarded the snapshot that we had, so i can't compare
> the database. the only new piece of information that i have is that ebs
> snapshots are differential, and clearing out all history seems to have
> 'resolved' the issue. not an answer i'm happy with, but it's no longer
> replicatable.

Re: couch behavior for san snapshots

Posted by Alex P <ap...@kolosy.com>.
we're using xfs, but thanks for the offer.

looks like someone discarded the snapshot that we had, so i can't compare
the database. the only new piece of information that i have is that ebs
snapshots are differential, and clearing out all history seems to have
'resolved' the issue. not an answer i'm happy with, but it's no longer
reproducible.

On Mon, Oct 19, 2009 at 10:35 PM, Dusty Doris <co...@dusty.name>wrote:

> On Mon, Oct 19, 2009 at 5:37 PM, Alex P <ap...@kolosy.com> wrote:
> > thanks! i'll definitely look into that.
>
> If you are using ext3 I can also help you with that.  You need to use
> LVM2 to create volumes and then you can do something like
>
> $ dmsetup suspend /dev/ebs/couchvol0
> $ ebs_snapshot
> $ dmsetup resume /dev/ebs/couchvol0
>
> If you'd like more info on that let me know and I'll add some more details.

Re: couch behavior for san snapshots

Posted by Dusty Doris <co...@dusty.name>.
On Mon, Oct 19, 2009 at 5:37 PM, Alex P <ap...@kolosy.com> wrote:
> thanks! i'll definitely look into that.

If you are using ext3 I can also help you with that.  You need to use
LVM2 to create volumes and then you can do something like

$ dmsetup suspend /dev/ebs/couchvol0
$ ebs_snapshot
$ dmsetup resume /dev/ebs/couchvol0

If you'd like more info on that let me know and I'll add some more details.
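
The three commands above can be wrapped so the volume is resumed even when
the snapshot step fails. This is only a sketch: the function name and the
DMSETUP override are assumptions added for illustration, and ebs_snapshot
stays a placeholder for whatever actually requests the snapshot.

```shell
# Suspend a device-mapper volume, run the snapshot command, then resume
# the volume whether or not the snapshot succeeded.
# DMSETUP can be overridden (e.g. DMSETUP='echo dmsetup') to rehearse.
snapshot_suspended() {
    device=$1
    shift                        # remaining arguments: the snapshot command
    dmsetup=${DMSETUP:-dmsetup}
    $dmsetup suspend "$device" || return 1
    "$@"
    status=$?                    # remember whether the snapshot worked
    $dmsetup resume "$device" || return 1
    return "$status"
}

# Example:
#   snapshot_suspended /dev/ebs/couchvol0 ebs_snapshot
```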




Re: couch behavior for san snapshots

Posted by Alex P <ap...@kolosy.com>.
thanks! i'll definitely look into that.

On Mon, Oct 19, 2009 at 4:32 PM, Glenn Rempe <gl...@rempe.us> wrote:

> Not sure if this will help at all, but I am also using EBS to store a large
> cdb DB.
> It is recommended that the filesystem that you want to snapshot on EBS is
> frozen (or unmounted) if that is possible to help ensure a coherent EBS
> snapshot.
>
> For example, on my mysql and couchdb volumes I use the XFS filesystem which
> has an XFS freeze command which I use to temporarily lock the FS and flush
> it to disk, and then request the snapshot.  You can do this in a script and
> it takes only a few seconds of disk downtime.  I believe writes are cached
> during this period.  I have been using this with mysql but have not yet set
> it up for my couchdb volume.
>
> See :
>
>
> http://developer.amazonwebservices.com/connect/entry.jspa?categoryID=100&externalID=1663
> http://alestic.com/2009/09/ec2-consistent-snapshot
>
> HTH

Re: couch behavior for san snapshots

Posted by Glenn Rempe <gl...@rempe.us>.
Not sure if this will help at all, but I am also using EBS to store a large
cdb DB.
It is recommended that the filesystem that you want to snapshot on EBS is
frozen (or unmounted) if that is possible to help ensure a coherent EBS
snapshot.

For example, on my mysql and couchdb volumes I use the XFS filesystem, which
has a freeze command (xfs_freeze) that I use to temporarily lock the FS and
flush it to disk before requesting the snapshot.  You can do this in a script
and it takes only a few seconds of disk downtime.  I believe writes are cached
during this period.  I have been using this with mysql but have not yet set
it up for my couchdb volume.

See :

http://developer.amazonwebservices.com/connect/entry.jspa?categoryID=100&externalID=1663
http://alestic.com/2009/09/ec2-consistent-snapshot
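
The freeze, snapshot, unfreeze sequence described above might look roughly
like this in a script. It is a sketch under assumptions: the function name
and the FREEZE_CMD override are invented here, and the snapshot command is
whatever you pass in. Processes that write to the filesystem while it is
frozen block until it is thawed, so the trap makes sure the thaw happens
even if the snapshot command fails.

```shell
# Freeze an XFS filesystem, run a snapshot command, then thaw it.
# The body runs in a subshell so the EXIT trap is scoped to this call.
# FREEZE_CMD defaults to xfs_freeze; override it to rehearse the steps.
snapshot_frozen() (
    mount_point=$1
    shift                                    # remaining arguments: the snapshot command
    freeze=${FREEZE_CMD:-xfs_freeze}
    $freeze -f "$mount_point" || exit 1      # -f: flush and freeze
    trap '$freeze -u "$mount_point"' EXIT    # -u: unfreeze on any exit
    "$@"
    status=$?
    exit "$status"                           # pass on the snapshot command's status
)

# Example (hypothetical mount point and snapshot command):
#   snapshot_frozen /vol my_snapshot_command
```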

HTH

On Mon, Oct 19, 2009 at 1:55 PM, Chris Anderson <jc...@apache.org> wrote:

> On Mon, Oct 19, 2009 at 12:53 PM, Alex P <ap...@kolosy.com> wrote:
> > a simple copy seems to work just fine. db shows up with the right seq
> number
> > etc.
>
> cool. totally wanna learn about the SAN roundtrip. :)



-- 
Glenn Rempe

email                 : glenn@rempe.us
voice                 : (415) 894-5366 or (415)-89G-LENN
twitter                : @grempe
contact info        : http://www.rempe.us/contact.html
pgp                    : http://www.rempe.us/gnupg.txt

Re: couch behavior for san snapshots

Posted by Chris Anderson <jc...@apache.org>.
On Mon, Oct 19, 2009 at 12:53 PM, Alex P <ap...@kolosy.com> wrote:
> a simple copy seems to work just fine. db shows up with the right seq number
> etc.

cool. totally wanna learn about the SAN roundtrip. :)

>
> On Mon, Oct 19, 2009 at 2:37 PM, Paul Davis <pa...@gmail.com>wrote:
>
>> On Mon, Oct 19, 2009 at 3:35 PM, Alex P <ap...@kolosy.com> wrote:
>> > cdb catalogs databases purely on file name and presence in the appropriate
>> > directory, right? so if i copy db1 to db1_copy, i should see db1_copy show
>> > up in futon?
>>
>> Yepper. Make sure the permissions are correct after copying though.
>>
>> Paul Davis
>>
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: couch behavior for san snapshots

Posted by Alex P <ap...@kolosy.com>.
a simple copy seems to work just fine. db shows up with the right seq number
etc.

On Mon, Oct 19, 2009 at 2:37 PM, Paul Davis <pa...@gmail.com>wrote:

> On Mon, Oct 19, 2009 at 3:35 PM, Alex P <ap...@kolosy.com> wrote:
> > cdb catalogs databases purely on file name and presence in the appropriate
> > directory, right? so if i copy db1 to db1_copy, i should see db1_copy show
> > up in futon?
>
> Yepper. Make sure the permissions are correct after copying though.
>
> Paul Davis
>

Re: couch behavior for san snapshots

Posted by Paul Davis <pa...@gmail.com>.
On Mon, Oct 19, 2009 at 3:35 PM, Alex P <ap...@kolosy.com> wrote:
> cdb catalogs databases purely on file name and presence in the appropriate
> directory, right? so if i copy db1 to db1_copy, i should see db1_copy show
> up in futon?

Yepper. Make sure the permissions are correct after copying though.

Paul Davis
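
The permissions point can be checked mechanically. Below is a minimal sketch, assuming a Unix host; the helper names copy_with_permissions and same_access are hypothetical and not part of CouchDB. It copies a database file, tries to give the copy the same owner and group as the source, and reports whether mode, owner, and group match afterwards.

```python
import os
import shutil
import stat


def copy_with_permissions(src_path, dst_path):
    """Copy a database file and try to match the source's owner and mode.

    Hypothetical helper for illustration; CouchDB itself provides nothing
    like this.
    """
    # copy2 copies the contents plus mode bits and timestamps.
    shutil.copy2(src_path, dst_path)
    src_stat = os.stat(src_path)
    try:
        # Changing the owner usually needs root; skip quietly if not allowed.
        os.chown(dst_path, src_stat.st_uid, src_stat.st_gid)
    except PermissionError:
        pass
    return dst_path


def same_access(src_path, dst_path):
    """Return True if both files share mode bits, owner, and group."""
    a, b = os.stat(src_path), os.stat(dst_path)
    return (stat.S_IMODE(a.st_mode), a.st_uid, a.st_gid) == (
        stat.S_IMODE(b.st_mode), b.st_uid, b.st_gid)
```

If same_access returns False after the copy, the copy is probably owned by whoever ran the script rather than by the account CouchDB runs as, which is the situation the reply above warns about.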

Re: couch behavior for san snapshots

Posted by Alex P <ap...@kolosy.com>.
cdb catalogs databases purely on file name and presence in the appropriate
directory, right? so if i copy db1 to db1_copy, i should see db1_copy show
up in futon?

On Mon, Oct 19, 2009 at 2:23 PM, Paul Davis <pa...@gmail.com> wrote:

> On Mon, Oct 19, 2009 at 3:18 PM, Alex P <ap...@kolosy.com> wrote:
> > hello,
> >
> > we ran into an issue this weekend where taking a SAN (amazon ebs)
> > snapshot of a couchdb mount somehow brought over only part of the
> > data. specifically, there were two small dbs (call them db1 and db2)
> > on the couch instance, each with several hundred documents. taking
> > the snapshot and bringing it up again showed both databases, but one
> > had 0 documents and an update sequence of 0. the instance the
> > snapshot was taken from still has both databases showing with
> > several hundred documents. we repeated this process several times,
> > with the same effect, on the same db.
> >
> > any thoughts would be much appreciated, as we are about to go live,
> > and not being able to do a backup of our db is ... disturbing.
> >
> > thanks,
> > alex.
> >
>
> Alex,
>
> That's most odd. What happens if you try and cp the database file to a
> new name? Something like:
>
> $ cp /usr/local/var/lib/couchdb/db1.couch \
>     /usr/local/var/lib/couchdb/db3.couch
>
> Another thing to try would be:
>
> $ curl -X POST http://127.0.0.1:5984/db1/_ensure_full_commit
>
> And then try to snapshot or copy again. I highly doubt that POST would
> affect anything, but it might be worth a shot.
>
> Paul Davis
>

Re: couch behavior for san snapshots

Posted by Paul Davis <pa...@gmail.com>.
On Mon, Oct 19, 2009 at 3:18 PM, Alex P <ap...@kolosy.com> wrote:
> hello,
>
> we ran into an issue this weekend where taking a SAN (amazon ebs) snapshot
> of a couchdb mount somehow brought over only part of the data. specifically,
> there were two small dbs (call them db1 and db2) on the couch instance, each
> with several hundred documents. taking the snapshot and bringing it up again
> showed both databases, but one had 0 documents and an update sequence of 0.
> the instance the snapshot was taken from still has both databases showing
> with several hundred documents. we repeated this process several times, with
> the same effect, on the same db.
>
> any thoughts would be much appreciated, as we are about to go live, and not
> being able to do a backup of our db is ... disturbing.
>
> thanks,
> alex.
>

Alex,

That's most odd. What happens if you try and cp the database file to a
new name? Something like:

$ cp /usr/local/var/lib/couchdb/db1.couch /usr/local/var/lib/couchdb/db3.couch

Another thing to try would be:

$ curl -X POST http://127.0.0.1:5984/db1/_ensure_full_commit

And then try to snapshot or copy again. I highly doubt that POST would
affect anything, but it might be worth a shot.

Paul Davis
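
The two steps above (flush, then copy) can also be scripted. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the Content-Type header is an assumption added because later CouchDB releases reportedly reject this POST without it. The curl command above sends no such header.

```python
import shutil
import urllib.request


def commit_url(base_url, db_name):
    """Build the _ensure_full_commit URL for one database."""
    return f"{base_url.rstrip('/')}/{db_name}/_ensure_full_commit"


def ensure_full_commit(base_url, db_name, timeout=10):
    """POST to _ensure_full_commit and return the raw response body."""
    request = urllib.request.Request(
        commit_url(base_url, db_name),
        data=b"",
        method="POST",
        # Assumption: harmless on old releases, expected by newer ones.
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def copy_database_file(src_path, dst_path):
    """Plain file copy, equivalent to the cp command above."""
    shutil.copyfile(src_path, dst_path)
    return dst_path


# Example use against a local instance (paths taken from the thread):
#   ensure_full_commit("http://127.0.0.1:5984", "db1")
#   copy_database_file("/usr/local/var/lib/couchdb/db1.couch",
#                      "/usr/local/var/lib/couchdb/db3.couch")
```

Running the commit first and the copy second mirrors the order suggested above; whether the commit changes the outcome of the snapshot is exactly what the experiment would show.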

Re: couch behavior for san snapshots

Posted by Chris Anderson <jc...@apache.org>.
On Mon, Oct 19, 2009 at 12:18 PM, Alex P <ap...@kolosy.com> wrote:
> hello,
>
> we ran into an issue this weekend where taking a SAN (amazon ebs) snapshot
> of a couchdb mount somehow brought over only part of the data. specifically,
> there were two small dbs (call them db1 and db2) on the couch instance, each
> with several hundred documents. taking the snapshot and bringing it up again
> showed both databases, but one had 0 documents and an update sequence of 0.
> the instance the snapshot was taken from still has both databases showing
> with several hundred documents. we repeated this process several times, with
> the same effect, on the same db.
>
> any thoughts would be much appreciated, as we are about to go live, and not
> being able to do a backup of our db is ... disturbing.
>

Is it possible to look for differences between the SAN
snapshot-restored file and the original? If you haven't compacted
since the snapshot, the two should be byte-for-byte identical up to the
point where the snapshot was taken.

If the snapshot restore corrupted the file, that should show up as a
difference earlier in the file.

Chris
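
A rough way to run that comparison is sketched below, assuming both files can be read from one machine. The function name first_difference is hypothetical. It returns the offset of the first byte that differs within the length the two files share, or None when the shorter file is an exact prefix of the longer one (or the files are identical), which is what an uncorrupted snapshot of an uncompacted file should look like.

```python
def first_difference(path_a, path_b, chunk_size=64 * 1024):
    """Return the offset of the first differing byte, or None.

    None means the files agree over their common length, i.e. one is a
    prefix of the other or they are identical. Hypothetical helper.
    """
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            common = min(len(chunk_a), len(chunk_b))
            if chunk_a[:common] != chunk_b[:common]:
                # Locate the exact byte inside this chunk.
                for i in range(common):
                    if chunk_a[i] != chunk_b[i]:
                        return offset + i
            if common == 0 or len(chunk_a) != len(chunk_b):
                # At least one file is exhausted and nothing differed.
                return None
            offset += common
```

A small offset would point at damage early in the restored file, while None together with a much smaller restored file would suggest the restore simply brought back less data. On the command line, cmp reports the same first differing offset.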

> thanks,
> alex.
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io