Posted to dev@hawq.apache.org by Ming Li <ml...@pivotal.io> on 2016/09/05 15:41:21 UTC

Re: HAWQ standby master sync process

Hi,

For the general idea, please refer to this PostgreSQL presentation:
https://www.pgcon.org/2008/schedule/attachments/61_Synchronous%20Log%20Shipping%20Replication.pdf


Here is some info about the standby code.

The standby-related code is here:
src/backend/postmaster/walredoserver.c
src/backend/postmaster/walsendserver.c

The global picture:
- A backend generates WAL and passes it to the forked "WAL Sender" process; the
call stack is: XLogQDMirrorWrite() => WalSendServerClientSendRequest()

- The "WAL Sender" process is forked and loops, processing requests and
responses; the call stack is:
walsendserver_forkexec() -> walsendserver_start() -> ServiceMain() ->
ServiceListenLoop() -> ServiceProcessRequest() ->
serviceConfig->ServiceRequest()
-> WalSendServer_ServiceRequest()

- The "WAL Sender" sends WAL to the "WAL Receiver" on the standby node; the
call stack is:
WalSendServer_ServiceRequest() => WalSendServerDoRequest() =>
disconnectMirrorQD_SendClose() => write_qd_sync() => PQsendQuery()

- On the standby side, the APIs are similar, e.g. walredoserver_forkexec()
vs. walsendserver_forkexec(). A toy model of the overall flow is sketched below.

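To make this concrete, here is a toy Python model of the synchronous
log-shipping idea from the slides above: a backend hands a WAL record to a
sender, the sender ships it to a receiver on the standby, and the commit only
returns after the standby has acknowledged the write. This is only a
conceptual sketch, not the HAWQ implementation; all names in it are made up.

    # Toy model of synchronous log shipping (conceptual only, not HAWQ code).
    import threading
    try:
        import queue            # Python 3
    except ImportError:
        import Queue as queue   # Python 2

    wal_to_send = queue.Queue()   # backend -> "WAL sender"
    acks = queue.Queue()          # "WAL receiver" (standby) -> backend
    standby_xlog = []             # what the standby has made durable

    def wal_sender():
        # Plays the role of the WAL sender: forward each record to the standby.
        while True:
            rec = wal_to_send.get()
            if rec is None:
                break
            wal_receiver(rec)     # in real life this goes over the network

    def wal_receiver(rec):
        # Plays the role of the WAL receiver: persist the record, then acknowledge.
        standby_xlog.append(rec)
        acks.put(rec["lsn"])

    def commit(lsn, payload):
        # Backend: emit WAL, then block until the standby has acknowledged it.
        wal_to_send.put({"lsn": lsn, "payload": payload})
        while acks.get() != lsn:
            pass
        print("commit at LSN %d durable on standby" % lsn)

    sender = threading.Thread(target=wal_sender)
    sender.start()
    commit(1, "INSERT ...")
    commit(2, "UPDATE ...")
    wal_to_send.put(None)
    sender.join()
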
Hope it helps you! ~_~



On Thu, Aug 11, 2016 at 1:09 AM, Kyle Dunn <kd...@pivotal.io> wrote:

> Hello,
>
> I'm investigating DR options for HAWQ and was curious about the existing
> master catalog synchronization process. My question is mainly around what
> this process does at a high level and where I might look in the code base
> or management tools to see about extending it for additional standby
> masters (e.g. one in a geographically distant data center and/or different
> logical HAWQ cluster). The assumption is the HDFS blocks would be
> replicated by something like distcp via Falcon.
>
> I believe there are obvious things to address like DFS / namenode URI
> parameters, FQDNs, and certainly failure scenarios / edge cases, but I'm
> mainly trying to get a dialog started to see what input, ideas, and
> considerations others have. One thing I'm specifically interested in is
> whether / how WAL can be used (@Keaton).
>
>
> Thanks,
> Kyle
> --
> *Kyle Dunn | Data Engineering | Pivotal*
> Direct: 303.905.3171 | Email: kdunn@pivotal.io
>

Re: HAWQ standby master sync process

Posted by Kyle Dunn <kd...@pivotal.io>.
Hey all -

I want to follow up here with a PR around the original motivation for this
thread: HAWQ-1078 - DR via Apache Falcon / HDFS replication.

I'd like to get any feedback on the implementation or answer questions on
the overall process. I've tried to document it thoroughly, but it'd be
great if someone else was able to replicate the functionality
independently.

The PR is available here: https://github.com/apache/incubator-hawq/pull/940


Thanks,
Kyle

On Sun, Sep 18, 2016 at 9:26 PM Ming Li <ml...@pivotal.io> wrote:

> 1) The status column in gp_segment_configuration indicates whether the standby
> is "synced" or "out of sync". All sync work is handled automatically, except
> when there is a network problem, a crash, or an out-of-memory condition; it
> recovers automatically as far as possible, so there is no need to offer a SQL
> command for this.
>
> 2) No. If the standby is 'out of sync', then even if you copy the full standby
> data directory and xlog, you cannot redo the database to the latest state.
> Consider that some UPDATEs are ongoing: how could you copy the standby without
> shutting it down, and make sure all in-flight UPDATEs are forwarded to the
> remote destination node?
>
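
For reference, the sync state described in point 1 can be read straight out of
gp_segment_configuration. A minimal sketch, assuming psql is on the PATH and
can reach the master without a password; column names vary slightly between
HAWQ releases, so check the gp_segment_configuration reference for your
version:

    # Minimal sketch: list catalog roles and their status on the master.
    import subprocess

    SQL = "SELECT role, status, hostname FROM gp_segment_configuration;"

    out = subprocess.check_output(
        ["psql", "-d", "template1", "-At", "-c", SQL])
    for line in out.decode("utf-8").splitlines():
        role, status, hostname = line.split("|")
        print("%-10s %-6s %s" % (role, status, hostname))
    # The standby master's row is the one to watch: its status field flips
    # when the standby falls out of sync with the master.
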
> On Sat, Sep 17, 2016 at 2:51 AM, Kyle Dunn <kd...@pivotal.io> wrote:
>
> A couple follow-on questions that originated from a production user:
>
> 1) Is there a way to ensure a standby master is "up-to-date" with WALs,
> either via a SQL query or some other process-external way?
>
> 2) Can a full archive of the standby MASTER_DATA_DIRECTORY be used in the
> restoration of another master at the DR site (or the originating one)? I
> realize there are some "role to hostname" mappings in the catalog that
> would need to be updated, but otherwise, [how] do the active and standby
> catalogs differ? This would be useful as an alternative to changing the WAL
> send/receive path in the code, since it allows "snapshotting" the existing
> standby master without disturbing normal activity on the active master.
>
>
> Thanks,
> Kyle
>
>
> On Mon, Sep 12, 2016 at 9:37 PM Ming Li <ml...@pivotal.io> wrote:
>
> Yes, as Wen said, we currently don't support 2 standby nodes at the same
> time, but we can change the code/design to support it after the design is
> finalized.
>
> As for having the master connect to 2 standby nodes directly, I don't think
> that is feasible:
> 1) Currently the standby process will report 'out of sync' if the connection
> to the master is lost, and it cannot be changed back to 'synced' without
> re-initializing the standby node. It may be a bug or a design limitation; I
> have not investigated.
> 2) Synchronizing to a remote standby will slow down master transaction commit
> processing dramatically; response times will be greatly prolonged, which is
> not acceptable unless the network is good and fast enough.
> 3) The master node is always busy, so other concurrent workloads will slow
> down the sync process, and the sync process will in turn slow down the
> throughput of the whole cluster.
>
> More discussion, or other solutions, may be needed.  Thanks.
>
>
>
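
Point (2) in the message above is easy to see with rough numbers. A
back-of-the-envelope sketch; the latencies below are assumptions for
illustration, not measurements of HAWQ:

    # Rough illustration of why a synchronous remote standby hurts commit latency.
    # All numbers are assumed for illustration only.
    local_commit_ms = 2.0    # local WAL flush on the master
    lan_standby_rtt = 0.5    # round trip to a standby in the same rack
    wan_standby_rtt = 60.0   # round trip to a standby in a distant data center

    print("local only       : %.1f ms" % local_commit_ms)
    print("sync LAN standby : %.1f ms" % (local_commit_ms + lan_standby_rtt))
    print("sync WAN standby : %.1f ms" % (local_commit_ms + wan_standby_rtt))
    # A commit that waits for the WAN standby is roughly 30x slower here, which
    # is why shipping WAL to the DR site asynchronously (or from the standby
    # rather than the master) is attractive.
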
> On Tue, Sep 13, 2016 at 9:56 AM, Wen Lin <wl...@pivotal.io> wrote:
>
> Kyle,
>
> When the HAWQ cluster is initialized, if a standby master is configured in
> hawq-site.xml, the HAWQ scripts will initialize the standby master on one node
> and register it in the master's gp_segment_configuration table. So the master
> knows about the standby master from this catalog table.
> Unlike a segment instance, which registers itself by sending heartbeat
> messages to the master, the standby master has no heartbeat message.
> It's not possible to have two standby masters running together; if you
> initialize another standby master, the first one in the
> gp_segment_configuration table will be removed.
>
> Regards!
>
> Wen
>
> On Tue, Sep 13, 2016 at 5:32 AM, Kyle Dunn <kd...@pivotal.io> wrote:
>
> > Hey Ming -
> >
> > Am I understanding correctly that a standby master will register
> > automagically to the active master, based on the contents of hawq-site.xml?
> >
> > What would happen if two different standby masters on different nodes both
> > tried registering with the same active master? I ask because this is the
> > exact situation that would be useful for having a passive DR site with HAWQ
> > installed, querying for new WALs in the same flow as a local standby.
> >
> > As for "daisy chaining" masters, which I believe is what you described in
> > (2) above: Master -> WAL -> Standby -> DR node, I think this may be less
> > desirable than multiple "normal" standby client nodes, as losing the
> > standby node becomes a cascading failure into DR.
> >
> > Anytime we can make use of the DFS available (I say DFS, rather than HDFS,
> > as the hope is eventually this would be S3, Azure blob, Ceph, etc) - we
> > should!  (unrelated to DR) In my mind, this includes propagating the
> > system catalog to segment nodes via the underlying DFS, rather than
> > transmitting as part of each query.
> >
> > Thank you for the helpful insight and discussion!
> >
> >
> > -Kyle
> >
> > On Thu, Sep 8, 2016 at 10:55 PM Ming Li <ml...@pivotal.io> wrote:
> >
> >> Hi Kyle,
> >>
> >> As for your question about how to configure the standby host: when the
> >> standby node (which is configured in hawq-site.xml) is started, it
> >> automatically registers its info in the system table
> >> gp_segment_configuration (see
> >> http://hdb.docs.pivotal.io/20/reference/catalog/gp_segment_configuration.html),
> >> so that hawq can use this info internally via the catalog. If you need more
> >> details about it, @wen lin can help you.
> >>
> >> Then the standby reports the LSN of the WAL it has synced to the master
> >> node. Using this LSN, the master checks whether the gap between master and
> >> standby is still in the xlog files or has been overwritten (because xlog
> >> files are recycled). If the gap is no longer in the xlog files, we cannot do
> >> anything further and just report "out of sync", which requires manually
> >> running hawq init standby to recreate the standby node; otherwise we just
> >> push the WAL after this LSN to the standby node and redo it there (a small
> >> sketch of this decision is included after this message). For any problem
> >> with the standby scripts you can ask @radar for help.
> >>
> >> In most cases the standby will have less workload than the master, so I
> >> suggest we could implement it as:
> >> (1) The master pushes WAL to the standby node; when the standby has received
> >> it, it first writes it to file, then reports success to the master, so
> >> transaction commit is not blocked.
> >> (2) The standby node redoes the WAL locally and, at the same time, guarantees
> >> that the WAL is transferred to the remote DR node. We can offer different
> >> sync policies (whether to guarantee the WAL has reached the remote node when
> >> a transaction commits) to trade transaction commit latency against data-loss
> >> tolerance at the remote node.
> >>
> >> More to be discussed:
> >> (1) If the standby reports "out of sync" and the gap is no longer available
> >> on the master node, we need to re-init the standby manually, which requires
> >> shutting down the master node. We need to think of a stronger policy for this
> >> scenario, e.g. just push the WAL to other nodes and write it as a duplicate
> >> file? Or we could go further and write it into hdfs directly?
> >> (2) If the multiple-master feature is implemented, the design may need to
> >> change. I haven't spent time on that.
> >>
> >> Any comments or suggestions are welcome. Thanks.
> >>
> >>
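
The catch-up decision described above (ship WAL from the standby's reported
LSN if that WAL is still on disk, otherwise declare the standby out of sync)
boils down to a comparison like the following sketch. The function and field
names are illustrative only, not the actual HAWQ symbols:

    # Illustrative only: decide whether the standby can be caught up from its LSN.
    def catchup_action(standby_lsn, oldest_retained_lsn, current_lsn):
        """standby_lsn         - last LSN the standby reports as synced
           oldest_retained_lsn - start of the oldest xlog file still on the master
           current_lsn         - master's current write position"""
        if standby_lsn >= current_lsn:
            return "in sync, nothing to ship"
        if standby_lsn < oldest_retained_lsn:
            # The gap was recycled; only 'hawq init standby' can recover from this.
            return "out of sync, re-init standby required"
        return "ship WAL from %s to %s and redo on standby" % (standby_lsn,
                                                               current_lsn)

    print(catchup_action(900, 800, 1000))  # ship WAL from 900 to 1000 ...
    print(catchup_action(500, 800, 1000))  # out of sync, re-init standby required
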
> >> On Fri, Sep 9, 2016 at 1:22 AM, Kyle Dunn <kd...@pivotal.io> wrote:
> >>
> >> > Ming -
> >> >
> >> > Thank you for the info, this is very helpful in understanding how WAL
> >> > shipment happens.
> >> >
> >> > One question I have is: if/where is the destination host configured in
> >> > walsendserver.c? Alternatively, does a standby master client initiate the
> >> > request rather than the active master pushing out WALs as they become
> >> > available? I ask because a more robust DR solution than what I'm currently
> >> > working on would allow multiple standby targets (i.e. one traditional
> >> > standby, one DR mirror, etc.).
> >> >
> >> > At the moment I've opted for an approach that stops the active HAWQ
> >> > master, creates a tarball of the entire MASTER_DATA_DIRECTORY, archives it
> >> > on HDFS, then invokes distcp via Apache Falcon to mirror /hawq_default in
> >> > HDFS to the DR site. After a DR event there would be some manual process
> >> > to restore said archive and update the hostname / DFS references to
> >> > reflect the actual DR environment. (A rough sketch of this snapshot step
> >> > is included after this message.)
> >> >
> >> > This approach is a step in the right direction, but the act of creating
> >> > the tarball necessitates a brief HAWQ master outage (currently ~1 minute
> >> > when excluding pg_log contents and not compressing), whereas extending the
> >> > walserver code could avoid any outage by allowing WAL replication to have
> >> > multiple destinations.
> >> >
> >> > The top-level code for orchestrating this process is currently written in
> >> > Python 2.6 compatible code - I'd like to have some review of it by the DEV
> >> > team, if possible, as a first step to a future PR for "HAWQ DR" via
> >> > Falcon.
> >> >
> >> > Thoughts?
> >> >
> >> >
> >> > -Kyle
> >> >
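
The stop / tar / archive-to-HDFS step outlined above can be scripted in a few
lines. A rough sketch, assuming the hawq and hadoop/hdfs CLIs are on the PATH;
the paths and cluster URIs below are placeholders to adjust, not HAWQ
defaults:

    # Rough sketch of the snapshot step: stop the master, tar its data directory,
    # park the tarball in HDFS, and let Falcon-driven distcp mirror it to DR.
    # Paths and cluster URIs are placeholders, not defaults from HAWQ.
    import subprocess, tarfile, time

    MASTER_DATA_DIRECTORY = "/data/hawq/master"
    archive = "/tmp/hawq-master-%s.tar" % time.strftime("%Y%m%d%H%M%S")

    subprocess.check_call(["hawq", "stop", "master", "-a"])
    try:
        # No compression keeps the outage short; skip pg_log contents.
        with tarfile.open(archive, "w") as tar:
            tar.add(MASTER_DATA_DIRECTORY, arcname="master",
                    filter=lambda ti: None if "pg_log" in ti.name else ti)
    finally:
        subprocess.check_call(["hawq", "start", "master", "-a"])

    # Stage the tarball alongside the data it describes; Falcon-driven distcp
    # then copies /hawq_default (and this archive) to the DR site, e.g.:
    #   hadoop distcp hdfs://prod/hawq_default hdfs://dr/hawq_default
    subprocess.check_call(["hdfs", "dfs", "-put", "-f", archive,
                           "/hawq_default/dr/"])
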
--
*Kyle Dunn | Data Engineering | Pivotal*
Direct: 303.905.3171 | Email: kdunn@pivotal.io

Re: HAWQ standby master sync process

Posted by Ming Li <ml...@pivotal.io>.
1) The status in the gp_segment_configuration indicate that standby is in
"synced" or "out of sync", all sync works are  automatically processed,
except that there are some network problem, crash, out of memory problems.
It can automatically do it as most as possible, no need to offer a SQL for
it.

2) No. If standby 'out of sync', then even you copy full standby and xlog,
you can not let the db redo to the latest status. Thinking that there are
some UPDATE ongoing, how can you copy standby without shutdown it, and make
sure all on-going UPDATEs are forward to the remote destination node?


On Sat, Sep 17, 2016 at 2:51 AM, Kyle Dunn <kd...@pivotal.io> wrote:

> A couple follow-on questions that originated from a production user:
>
> 1) Is there a way to ensure a standby master is "up-to-date" with WALs,
> either via a SQL query or some other process-external way?
>
> 2) Can a full archive of the standby MASTER_DATA_DIRECTORY be used in the
> restoration of another master at the DR site (or the originating one)? I
> realize there are some "role to hostname" mappings in the catalog that
> would need to be updated, but otherwise, [how] do the active and standby
> catalogs differ? This is useful as an alternative to changing the WAL
> send/receive path in the code path but allows "snapshotting" the existing
> standby master without disturbing normal activity on the active master.
>
>
> Thanks,
> Kyle
>
>
> On Mon, Sep 12, 2016 at 9:37 PM Ming Li <ml...@pivotal.io> wrote:
>
>> Yes, as Wen said, we currently don't support 2 standby nodes at the same
>> time, but we can change code/design to support it after the design
>> finalized.
>>
>> As for the master connect to 2 standby nodes directly, I think it is not
>> the feasible way:
>> 1) Now standby process will report 'out of sync' if the connection to
>> master lost, and it can't be changed to 'synced' without re-init standby
>> node. It maybe a bug or design limitation which I have not investigated.
>> 2) Remote standby sync will slow down master transaction commit
>> processing extremely, the responsible time will be greatly prolong, it is
>> not acceptable if the network is not good and fast enough.
>> 3) Master node always keep busy, it means other concurrent workload will
>> slow down sync process, and also the sync process will slow down the
>> throughout of the whole master cluster.
>>
>> Maybe more discussions or solutions are needed.  Thanks.
>>
>>
>>
>> On Tue, Sep 13, 2016 at 9:56 AM, Wen Lin <wl...@pivotal.io> wrote:
>>
>>> Kyle,
>>>
>>> When HAWQ cluster is initialized, if a standby master is configured in
>>> hawq-site.xml, the HAWQ scripts will initialize standby master on one
>>> node,
>>> and register it into master's gp_segment_configuration table. So the
>>> master
>>> knows standby master from this catalog table.
>>> Unlike segment instance, which is register itself by sending heartbeat
>>> message to master, standby master has no heartbeat message.
>>> It's not possible to have two standby masters running together, if you
>>> initialize another standby master, the first one in
>>> gp_segment_configuration table will be removed.
>>>
>>> Regards!
>>>
>>> Wen
>>>
>>> On Tue, Sep 13, 2016 at 5:32 AM, Kyle Dunn <kd...@pivotal.io> wrote:
>>>
>>> > Hey Ming -
>>> >
>>> > Am I understanding correctly that a standby master will register
>>> > automagically to the active master, based on the contents of
>>> hawq-site.xml?
>>> >
>>> > What would happen if two different standby masters on different nodes
>>> both
>>> > tried registering with the same active master? I ask because this is
>>> the
>>> > exact situation that would be useful for having a passive DR site with
>>> HAWQ
>>> > installed, querying for new WALs in the same flow as a local standby.
>>> >
>>> > As for "daisy chaining" masters, which I believe is what you described
>>> in
>>> > (2) above: Master -> WAL -> Standby -> DR node, I think this may be
>>> less
>>> > desirable than multiple "normal" standby client nodes, as losing the
>>> > standby node becomes a cascading failure into DR.
>>> >
>>> > Anytime we can make use of the DFS available (I say DFS, rather than
>>> HDFS,
>>> > as the hope is eventually this would be S3, Azure blob, Ceph, etc) - we
>>> > should!  (unrelated to DR) In my mind, this includes propagating the
>>> > system catalog to segment nodes via the underlying DFS, rather than
>>> > transmitting as part of each query.
>>> >
>>> > Thank you for the helpful insight and discussion!
>>> >
>>> >
>>> > -Kyle
>>> >
>>> > On Thu, Sep 8, 2016 at 10:55 PM Ming Li <ml...@pivotal.io> wrote:
>>> >
>>> >> Hi Kyle,
>>> >>
>>> >> As for your question how to config standby host, when standby
>>> nodes(which
>>> >> is config in hawq-site.xml) started, it will auto registered it's
>>> info in
>>> >> the system table gp_segment_configuration(
>>> >> there is system table:
>>> >> http://hdb.docs.pivotal.io/20/reference/catalog/gp_segment_
>>> >> configuration.html),
>>> >> so that hawq can use this info internally in catalog.  if you need
>>> more
>>> >> details about it, @wen lin can help you.
>>> >>
>>> >> Then standby will report the LSN of WALs it synched to master node,
>>> master
>>> >> node according to this LSN to test the gap between master and node is
>>> >> still
>>> >> in xlog file or it is overwritten (because xlog file recycled). If
>>> the gap
>>> >> is not in the xlog file, we cannot do further just report "out of
>>> sync",
>>> >> which need to manually run hawq init standby to recreate standby node;
>>> >> else
>>> >> we just push the WAL after this LSN to standby node, and redo them.
>>> All
>>> >> related standby script problem can ask @radar for help.
>>> >>
>>> >> In most cases the standby should be less workload than master, so I
>>> >> suggestion maybe we can implement it as:
>>> >> (1) Master push WAL to standby node, when standby received them, it
>>> >> firstly
>>> >> write to file, then report successfully to master so that no blocking
>>> >> transaction commit.
>>> >> (2) standby node redo them on this node, and at the same time, it
>>> need to
>>> >> guarantee that the WAL should be transferred to the remote DR node,
>>> we can
>>> >> set different sync policy (whether need to guarantee WAL transferred
>>> to
>>> >> remote node when transaction committed ) in case of different
>>> transaction
>>> >> commit latency and different data loss acceptance at remote node.
>>> >>
>>> >> More to discussed:
>>> >> (1) If standby "report out of sync" and gap is not available on master
>>> >> node, we need to reinit standby manually, which need to shutdown
>>> master
>>> >> node. We need to think an stronger policy for this scenario, e.g. just
>>> >> push
>>> >> WAL to other nodes, and write as duplicate file? or we can further to
>>> >> write
>>> >> into hdfs directly?
>>> >> (2) If multiple master feature implemented, maybe the design need to
>>> be
>>> >> changed. I don't take time on it.
>>> >>
>>> >> Any comments or suggestions are welcomed. Thanks.
>>> >>
>>> >>
>>> >> On Fri, Sep 9, 2016 at 1:22 AM, Kyle Dunn <kd...@pivotal.io> wrote:
>>> >>
>>> >> > Ming -
>>> >> >
>>> >> > Thank you for the info, this is very helpful in understanding how
>>> WAL
>>> >> > shipment happens.
>>> >> >
>>> >> > One question I have is: if/where the destination host is configured
>>> in
>>> >> > walsendserver.c? Alternatively, does a standby master client
>>> initiate
>>> >> the
>>> >> > request rather than the active master pushing out WALs as they
>>> become
>>> >> > available? I ask because for a more robust DR solution than what I'm
>>> >> > currently working on would allow multiple standby targets (i.e. one
>>> >> > traditional standby, one DR mirror, etc.)
>>> >> >
>>> >> > At the moment I've opted for an approach that stops the active HAWQ
>>> >> master,
>>> >> > creates a tarball of the entire MASTER_DATA_DIRECTORY, archives it
>>> on
>>> >> HDFS,
>>> >> > then invokes distcp via Apache Falcon to mirror /hawq_default in
>>> HDFS to
>>> >> > the DR site. After a DR event there would be some manual process to
>>> >> restore
>>> >> > said archive and update the hostname / DFS references to reflect the
>>> >> actual
>>> >> > DR environment.
>>> >> >
>>> >> > This approach is a step in the right direction but the act of
>>> creating
>>> >> the
>>> >> > tarball necessitates a brief HAWQ master outage (currently ~1 minute
>>> >> when
>>> >> > excluding pg_log contents and not compressing), whereas extending
>>> the
>>> >> > walserver code could avoid any outage by allowing WAL replication to
>>> >> have
>>> >> > multiple destinations.
>>> >> >
>>> >> > The top-level code for orchestrating this process is currently
>>> written
>>> >> in
>>> >> > Python 2.6 compatible code - I'd like to have some review of it by
>>> the
>>> >> DEV
>>> >> > team, if possible, as a first step to a future PR for "HAWQ DR" via
>>> >> Falcon.
>>> >> >
>>> >> > Thoughts?
>>> >> >
>>> >> >
>>> >> > -Kyle
>>> >> >
>>> >> > On Mon, Sep 5, 2016 at 9:41 AM Ming Li <ml...@pivotal.io> wrote:
>>> >> >
>>> >> > > Hi,
>>> >> > >
>>> >> > > The general idea please refer to PostgreSQL:
>>> >> > >
>>> >> > > https://www.pgcon.org/2008/schedule/attachments/61_
>>> >> > Synchronous%20Log%20Shipping%20Replication.pdf
>>> >> > >
>>> >> > >
>>> >> > > Here just share some info about standby code.
>>> >> > >
>>> >> > > The standby related code is here:
>>> >> > > src/backend/postmaster/walredoserver.c
>>> >> > > src/backend/postmaster/walsendserver.c
>>> >> > >
>>> >> > > Global pic:
>>> >> > > - Backend generate WAL and pass it to the forked process "WAL
>>> Sender",
>>> >> > the
>>> >> > > calling stack is: XLogQDMirrorWrite() =>
>>> >> WalSendServerClientSendRequest
>>> >> > ()
>>> >> > >
>>> >> > > - "WAL sender" process will be forked up and loop for processing
>>> >> request
>>> >> > > and response, the calling stack is:
>>> >> > > walsendserver_forkexec() -> walsendserver_start() ->
>>> ServiceMain() ->
>>> >> > > ServiceListenLoop() -> ServiceProcessRequest() ->
>>> >> > > serviceConfig->ServiceRequest()
>>> >> > > -> WalSendServer_ServiceRequest()
>>> >> > >
>>> >> > > - "WAL Sender" send WAL to "WAL Receiver" which is on the standby
>>> >> node,
>>> >> > the
>>> >> > > calling stack is:
>>> >> > > WalSendServer_ServiceRequest() => WalSendServerDoRequest() =>
>>> >> > > disconnectMirrorQD_SendClose() => write_qd_sync() => PQsendQuery()
>>> >> > >
>>> >> > > - On the standby side, all API are similar,  e.g.
>>> >> > walredoserver_forkexec()
>>> >> > > vs walsendserver_forkexec()
>>> >> > >
>>> >> > > Hope it helps you! ~_~
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > On Thu, Aug 11, 2016 at 1:09 AM, Kyle Dunn <kd...@pivotal.io>
>>> wrote:
>>> >> > >
>>> >> > > > Hello,
>>> >> > > >
>>> >> > > > I'm investigating DR options for HAWQ and was curious about the
>>> >> > existing
>>> >> > > > master catalog synchronization process. My question is mainly
>>> around
>>> >> > what
>>> >> > > > this process does at a high level and where I might look in the
>>> code
>>> >> > base
>>> >> > > > or management tools to see about extending it for additional
>>> standby
>>> >> > > > masters (e.g. one in a geographically distant data center and/or
>>> >> > > different
>>> >> > > > logical HAWQ cluster). The assumption is the HDFS blocks would
>>> be
>>> >> > > > replicated by something like distcp via Falcon.
>>> >> > > >
>>> >> > > > I believe there are obvious things to address like DFS /
>>> namenode
>>> >> URI
>>> >> > > > parameters, FQDNs, and certainly failure scenarios / edge
>>> cases, but
>>> >> > I'm
>>> >> > > > mainly trying to get a dialog started to see what input, ideas,
>>> and
>>> >> > > > considerations others have. One thing I'm specifically
>>> interested
>>> >> in is
>>> >> > > > whether / how WAL can be used (@Keaton).
>>> >> > > >
>>> >> > > >
>>> >> > > > Thanks,
>>> >> > > > Kyle
>>> >> > > > --
>>> >> > > > *Kyle Dunn | Data Engineering | Pivotal*
>>> >> > > > Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
>>> >> > > >
>>> >> > >
>>> >> > --
>>> >> > *Kyle Dunn | Data Engineering | Pivotal*
>>> >> > Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
>>> >> >
>>> >>
>>> > --
>>> > *Kyle Dunn | Data Engineering | Pivotal*
>>> > Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
>>> >
>>>
>>
>> --
> *Kyle Dunn | Data Engineering | Pivotal*
> Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
>

Re: HAWQ standby master sync process

Posted by Kyle Dunn <kd...@pivotal.io>.
A couple follow-on questions that originated from a production user:

1) Is there a way to ensure a standby master is "up-to-date" with WALs,
either via a SQL query or some other process-external way?

2) Can a full archive of the standby MASTER_DATA_DIRECTORY be used in the
restoration of another master at the DR site (or the originating one)? I
realize there are some "role to hostname" mappings in the catalog that
would need to be updated, but otherwise, [how] do the active and standby
catalogs differ? This is useful as an alternative to changing the WAL
send/receive path in the code path but allows "snapshotting" the existing
standby master without disturbing normal activity on the active master.


Thanks,
Kyle

On Mon, Sep 12, 2016 at 9:37 PM Ming Li <ml...@pivotal.io> wrote:

> Yes, as Wen said, we currently don't support 2 standby nodes at the same
> time, but we can change code/design to support it after the design
> finalized.
>
> As for the master connect to 2 standby nodes directly, I think it is not
> the feasible way:
> 1) Now standby process will report 'out of sync' if the connection to
> master lost, and it can't be changed to 'synced' without re-init standby
> node. It maybe a bug or design limitation which I have not investigated.
> 2) Remote standby sync will slow down master transaction commit processing
> extremely, the responsible time will be greatly prolong, it is not
> acceptable if the network is not good and fast enough.
> 3) Master node always keep busy, it means other concurrent workload will
> slow down sync process, and also the sync process will slow down the
> throughout of the whole master cluster.
>
> Maybe more discussions or solutions are needed.  Thanks.
>
>
>
> On Tue, Sep 13, 2016 at 9:56 AM, Wen Lin <wl...@pivotal.io> wrote:
>
>> Kyle,
>>
>> When HAWQ cluster is initialized, if a standby master is configured in
>> hawq-site.xml, the HAWQ scripts will initialize standby master on one
>> node,
>> and register it into master's gp_segment_configuration table. So the
>> master
>> knows standby master from this catalog table.
>> Unlike segment instance, which is register itself by sending heartbeat
>> message to master, standby master has no heartbeat message.
>> It's not possible to have two standby masters running together, if you
>> initialize another standby master, the first one in
>> gp_segment_configuration table will be removed.
>>
>> Regards!
>>
>> Wen
>>
>> On Tue, Sep 13, 2016 at 5:32 AM, Kyle Dunn <kd...@pivotal.io> wrote:
>>
>> > Hey Ming -
>> >
>> > Am I understanding correctly that a standby master will register
>> > automagically to the active master, based on the contents of
>> hawq-site.xml?
>> >
>> > What would happen if two different standby masters on different nodes
>> both
>> > tried registering with the same active master? I ask because this is the
>> > exact situation that would be useful for having a passive DR site with
>> HAWQ
>> > installed, querying for new WALs in the same flow as a local standby.
>> >
>> > As for "daisy chaining" masters, which I believe is what you described
>> in
>> > (2) above: Master -> WAL -> Standby -> DR node, I think this may be less
>> > desirable than multiple "normal" standby client nodes, as losing the
>> > standby node becomes a cascading failure into DR.
>> >
>> > Anytime we can make use of the DFS available (I say DFS, rather than
>> HDFS,
>> > as the hope is eventually this would be S3, Azure blob, Ceph, etc) - we
>> > should!  (unrelated to DR) In my mind, this includes propagating the
>> > system catalog to segment nodes via the underlying DFS, rather than
>> > transmitting as part of each query.
>> >
>> > Thank you for the helpful insight and discussion!
>> >
>> >
>> > -Kyle
>> >
>> > On Thu, Sep 8, 2016 at 10:55 PM Ming Li <ml...@pivotal.io> wrote:
>> >
>> >> Hi Kyle,
>> >>
>> >> As for your question how to config standby host, when standby
>> nodes(which
>> >> is config in hawq-site.xml) started, it will auto registered it's info
>> in
>> >> the system table gp_segment_configuration(
>> >> there is system table:
>> >> http://hdb.docs.pivotal.io/20/reference/catalog/gp_segment_
>> >> configuration.html),
>> >> so that hawq can use this info internally in catalog.  if you need more
>> >> details about it, @wen lin can help you.
>> >>
>> >> Then standby will report the LSN of WALs it synched to master node,
>> master
>> >> node according to this LSN to test the gap between master and node is
>> >> still
>> >> in xlog file or it is overwritten (because xlog file recycled). If the
>> gap
>> >> is not in the xlog file, we cannot do further just report "out of
>> sync",
>> >> which need to manually run hawq init standby to recreate standby node;
>> >> else
>> >> we just push the WAL after this LSN to standby node, and redo them. All
>> >> related standby script problem can ask @radar for help.
>> >>
>> >> In most cases the standby should be less workload than master, so I
>> >> suggestion maybe we can implement it as:
>> >> (1) Master push WAL to standby node, when standby received them, it
>> >> firstly
>> >> write to file, then report successfully to master so that no blocking
>> >> transaction commit.
>> >> (2) standby node redo them on this node, and at the same time, it need
>> to
>> >> guarantee that the WAL should be transferred to the remote DR node, we
>> can
>> >> set different sync policy (whether need to guarantee WAL transferred to
>> >> remote node when transaction committed ) in case of different
>> transaction
>> >> commit latency and different data loss acceptance at remote node.
>> >>
>> >> More to discussed:
>> >> (1) If standby "report out of sync" and gap is not available on master
>> >> node, we need to reinit standby manually, which need to shutdown master
>> >> node. We need to think an stronger policy for this scenario, e.g. just
>> >> push
>> >> WAL to other nodes, and write as duplicate file? or we can further to
>> >> write
>> >> into hdfs directly?
>> >> (2) If multiple master feature implemented, maybe the design need to be
>> >> changed. I don't take time on it.
>> >>
>> >> Any comments or suggestions are welcomed. Thanks.
>> >>
>> >>
>> >> On Fri, Sep 9, 2016 at 1:22 AM, Kyle Dunn <kd...@pivotal.io> wrote:
>> >>
>> >> > Ming -
>> >> >
>> >> > Thank you for the info, this is very helpful in understanding how WAL
>> >> > shipment happens.
>> >> >
>> >> > One question I have is: if/where the destination host is configured
>> in
>> >> > walsendserver.c? Alternatively, does a standby master client initiate
>> >> the
>> >> > request rather than the active master pushing out WALs as they become
>> >> > available? I ask because for a more robust DR solution than what I'm
>> >> > currently working on would allow multiple standby targets (i.e. one
>> >> > traditional standby, one DR mirror, etc.)
>> >> >
>> >> > At the moment I've opted for an approach that stops the active HAWQ
>> >> master,
>> >> > creates a tarball of the entire MASTER_DATA_DIRECTORY, archives it on
>> >> HDFS,
>> >> > then invokes distcp via Apache Falcon to mirror /hawq_default in
>> HDFS to
>> >> > the DR site. After a DR event there would be some manual process to
>> >> restore
>> >> > said archive and update the hostname / DFS references to reflect the
>> >> actual
>> >> > DR environment.
>> >> >
>> >> > This approach is a step in the right direction but the act of
>> creating
>> >> the
>> >> > tarball necessitates a brief HAWQ master outage (currently ~1 minute
>> >> when
>> >> > excluding pg_log contents and not compressing), whereas extending the
>> >> > walserver code could avoid any outage by allowing WAL replication to
>> >> have
>> >> > multiple destinations.
>> >> >
>> >> > The top-level code for orchestrating this process is currently
>> written
>> >> in
>> >> > Python 2.6 compatible code - I'd like to have some review of it by
>> the
>> >> DEV
>> >> > team, if possible, as a first step to a future PR for "HAWQ DR" via
>> >> Falcon.
>> >> >
>> >> > Thoughts?
>> >> >
>> >> >
>> >> > -Kyle
>> >> >
>> >> > On Mon, Sep 5, 2016 at 9:41 AM Ming Li <ml...@pivotal.io> wrote:
>> >> >
>> >> > > Hi,
>> >> > >
>> >> > > The general idea please refer to PostgreSQL:
>> >> > >
>> >> > > https://www.pgcon.org/2008/schedule/attachments/61_
>> >> > Synchronous%20Log%20Shipping%20Replication.pdf
>> >> > >
>> >> > >
>> >> > > Here just share some info about standby code.
>> >> > >
>> >> > > The standby related code is here:
>> >> > > src/backend/postmaster/walredoserver.c
>> >> > > src/backend/postmaster/walsendserver.c
>> >> > >
>> >> > > Global pic:
>> >> > > - Backend generate WAL and pass it to the forked process "WAL
>> Sender",
>> >> > the
>> >> > > calling stack is: XLogQDMirrorWrite() =>
>> >> WalSendServerClientSendRequest
>> >> > ()
>> >> > >
>> >> > > - "WAL sender" process will be forked up and loop for processing
>> >> request
>> >> > > and response, the calling stack is:
>> >> > > walsendserver_forkexec() -> walsendserver_start() -> ServiceMain()
>> ->
>> >> > > ServiceListenLoop() -> ServiceProcessRequest() ->
>> >> > > serviceConfig->ServiceRequest()
>> >> > > -> WalSendServer_ServiceRequest()
>> >> > >
>> >> > > - "WAL Sender" send WAL to "WAL Receiver" which is on the standby
>> >> node,
>> >> > the
>> >> > > calling stack is:
>> >> > > WalSendServer_ServiceRequest() => WalSendServerDoRequest() =>
>> >> > > disconnectMirrorQD_SendClose() => write_qd_sync() => PQsendQuery()
>> >> > >
>> >> > > - On the standby side, all API are similar,  e.g.
>> >> > walredoserver_forkexec()
>> >> > > vs walsendserver_forkexec()
>> >> > >
>> >> > > Hope it helps you! ~_~
>> >> > >
>> >> > >
>> >> > >
>> >> > > On Thu, Aug 11, 2016 at 1:09 AM, Kyle Dunn <kd...@pivotal.io>
>> wrote:
>> >> > >
>> >> > > > Hello,
>> >> > > >
>> >> > > > I'm investigating DR options for HAWQ and was curious about the
>> >> > existing
>> >> > > > master catalog synchronization process. My question is mainly
>> around
>> >> > what
>> >> > > > this process does at a high level and where I might look in the
>> code
>> >> > base
>> >> > > > or management tools to see about extending it for additional
>> standby
>> >> > > > masters (e.g. one in a geographically distant data center and/or
>> >> > > different
>> >> > > > logical HAWQ cluster). The assumption is the HDFS blocks would be
>> >> > > > replicated by something like distcp via Falcon.
>> >> > > >
>> >> > > > I believe there are obvious things to address like DFS / namenode
>> >> URI
>> >> > > > parameters, FQDNs, and certainly failure scenarios / edge cases,
>> but
>> >> > I'm
>> >> > > > mainly trying to get a dialog started to see what input, ideas,
>> and
>> >> > > > considerations others have. One thing I'm specifically interested
>> >> in is
>> >> > > > whether / how WAL can be used (@Keaton).
>> >> > > >
>> >> > > >
>> >> > > > Thanks,
>> >> > > > Kyle
>> >> > > > --
>> >> > > > *Kyle Dunn | Data Engineering | Pivotal*
>> >> > > > Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
>> >> > > >
>> >> > >
>> >> > --
>> >> > *Kyle Dunn | Data Engineering | Pivotal*
>> >> > Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
>> >> >
>> >>
>> > --
>> > *Kyle Dunn | Data Engineering | Pivotal*
>> > Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
>> >
>>
>
> --
*Kyle Dunn | Data Engineering | Pivotal*
Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io

Re: HAWQ standby master sync process

Posted by Ming Li <ml...@pivotal.io>.
Yes, as Wen said, we currently don't support 2 standby nodes at the same
time, but we can change code/design to support it after the design
finalized.

As for the master connect to 2 standby nodes directly, I think it is not
the feasible way:
1) Now standby process will report 'out of sync' if the connection to
master lost, and it can't be changed to 'synced' without re-init standby
node. It maybe a bug or design limitation which I have not investigated.
2) Remote standby sync will slow down master transaction commit processing
extremely, the responsible time will be greatly prolong, it is not
acceptable if the network is not good and fast enough.
3) Master node always keep busy, it means other concurrent workload will
slow down sync process, and also the sync process will slow down the
throughout of the whole master cluster.

Maybe more discussions or solutions are needed.  Thanks.



On Tue, Sep 13, 2016 at 9:56 AM, Wen Lin <wl...@pivotal.io> wrote:

> Kyle,
>
> When HAWQ cluster is initialized, if a standby master is configured in
> hawq-site.xml, the HAWQ scripts will initialize standby master on one node,
> and register it into master's gp_segment_configuration table. So the master
> knows standby master from this catalog table.
> Unlike segment instance, which is register itself by sending heartbeat
> message to master, standby master has no heartbeat message.
> It's not possible to have two standby masters running together, if you
> initialize another standby master, the first one in
> gp_segment_configuration table will be removed.
>
> Regards!
>
> Wen
>
> On Tue, Sep 13, 2016 at 5:32 AM, Kyle Dunn <kd...@pivotal.io> wrote:
>
> > Hey Ming -
> >
> > Am I understanding correctly that a standby master will register
> > automagically to the active master, based on the contents of
> hawq-site.xml?
> >
> > What would happen if two different standby masters on different nodes
> both
> > tried registering with the same active master? I ask because this is the
> > exact situation that would be useful for having a passive DR site with
> HAWQ
> > installed, querying for new WALs in the same flow as a local standby.
> >
> > As for "daisy chaining" masters, which I believe is what you described in
> > (2) above: Master -> WAL -> Standby -> DR node, I think this may be less
> > desirable than multiple "normal" standby client nodes, as losing the
> > standby node becomes a cascading failure into DR.
> >
> > Anytime we can make use of the DFS available (I say DFS, rather than
> HDFS,
> > as the hope is eventually this would be S3, Azure blob, Ceph, etc) - we
> > should!  (unrelated to DR) In my mind, this includes propagating the
> > system catalog to segment nodes via the underlying DFS, rather than
> > transmitting as part of each query.
> >
> > Thank you for the helpful insight and discussion!
> >
> >
> > -Kyle
> >
> > On Thu, Sep 8, 2016 at 10:55 PM Ming Li <ml...@pivotal.io> wrote:
> >
> >> Hi Kyle,
> >>
> >> As for your question how to config standby host, when standby
> nodes(which
> >> is config in hawq-site.xml) started, it will auto registered it's info
> in
> >> the system table gp_segment_configuration(
> >> there is system table:
> >> http://hdb.docs.pivotal.io/20/reference/catalog/gp_segment_
> >> configuration.html),
> >> so that hawq can use this info internally in catalog.  if you need more
> >> details about it, @wen lin can help you.
> >>
> >> Then standby will report the LSN of WALs it synched to master node,
> master
> >> node according to this LSN to test the gap between master and node is
> >> still
> >> in xlog file or it is overwritten (because xlog file recycled). If the
> gap
> >> is not in the xlog file, we cannot do further just report "out of sync",
> >> which need to manually run hawq init standby to recreate standby node;
> >> else
> >> we just push the WAL after this LSN to standby node, and redo them. All
> >> related standby script problem can ask @radar for help.
> >>
> >> In most cases the standby should be less workload than master, so I
> >> suggestion maybe we can implement it as:
> >> (1) Master push WAL to standby node, when standby received them, it
> >> firstly
> >> write to file, then report successfully to master so that no blocking
> >> transaction commit.
> >> (2) standby node redo them on this node, and at the same time, it need
> to
> >> guarantee that the WAL should be transferred to the remote DR node, we
> can
> >> set different sync policy (whether need to guarantee WAL transferred to
> >> remote node when transaction committed ) in case of different
> transaction
> >> commit latency and different data loss acceptance at remote node.
> >>
> >> More to discussed:
> >> (1) If standby "report out of sync" and gap is not available on master
> >> node, we need to reinit standby manually, which need to shutdown master
> >> node. We need to think an stronger policy for this scenario, e.g. just
> >> push
> >> WAL to other nodes, and write as duplicate file? or we can further to
> >> write
> >> into hdfs directly?
> >> (2) If multiple master feature implemented, maybe the design need to be
> >> changed. I don't take time on it.
> >>
> >> Any comments or suggestions are welcomed. Thanks.
> >>
> >>
> >> On Fri, Sep 9, 2016 at 1:22 AM, Kyle Dunn <kd...@pivotal.io> wrote:
> >>
> >> > Ming -
> >> >
> >> > Thank you for the info, this is very helpful in understanding how WAL
> >> > shipment happens.
> >> >
> >> > One question I have is: if/where the destination host is configured in
> >> > walsendserver.c? Alternatively, does a standby master client initiate
> >> the
> >> > request rather than the active master pushing out WALs as they become
> >> > available? I ask because for a more robust DR solution than what I'm
> >> > currently working on would allow multiple standby targets (i.e. one
> >> > traditional standby, one DR mirror, etc.)
> >> >
> >> > At the moment I've opted for an approach that stops the active HAWQ
> >> master,
> >> > creates a tarball of the entire MASTER_DATA_DIRECTORY, archives it on
> >> HDFS,
> >> > then invokes distcp via Apache Falcon to mirror /hawq_default in HDFS
> to
> >> > the DR site. After a DR event there would be some manual process to
> >> restore
> >> > said archive and update the hostname / DFS references to reflect the
> >> actual
> >> > DR environment.
> >> >
> >> > This approach is a step in the right direction but the act of creating
> >> the
> >> > tarball necessitates a brief HAWQ master outage (currently ~1 minute
> >> when
> >> > excluding pg_log contents and not compressing), whereas extending the
> >> > walserver code could avoid any outage by allowing WAL replication to
> >> have
> >> > multiple destinations.
> >> >
> >> > The top-level code for orchestrating this process is currently written
> >> in
> >> > Python 2.6 compatible code - I'd like to have some review of it by the
> >> DEV
> >> > team, if possible, as a first step to a future PR for "HAWQ DR" via
> >> Falcon.
> >> >
> >> > Thoughts?
> >> >
> >> >
> >> > -Kyle
> >> >
> >> > On Mon, Sep 5, 2016 at 9:41 AM Ming Li <ml...@pivotal.io> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > The general idea please refer to PostgreSQL:
> >> > >
> >> > > https://www.pgcon.org/2008/schedule/attachments/61_
> >> > Synchronous%20Log%20Shipping%20Replication.pdf
> >> > >
> >> > >
> >> > > Here just share some info about standby code.
> >> > >
> >> > > The standby related code is here:
> >> > > src/backend/postmaster/walredoserver.c
> >> > > src/backend/postmaster/walsendserver.c
> >> > >
> >> > > Global pic:
> >> > > - Backend generate WAL and pass it to the forked process "WAL
> Sender",
> >> > the
> >> > > calling stack is: XLogQDMirrorWrite() =>
> >> WalSendServerClientSendRequest
> >> > ()
> >> > >
> >> > > - "WAL sender" process will be forked up and loop for processing
> >> request
> >> > > and response, the calling stack is:
> >> > > walsendserver_forkexec() -> walsendserver_start() -> ServiceMain()
> ->
> >> > > ServiceListenLoop() -> ServiceProcessRequest() ->
> >> > > serviceConfig->ServiceRequest()
> >> > > -> WalSendServer_ServiceRequest()
> >> > >
> >> > > - "WAL Sender" send WAL to "WAL Receiver" which is on the standby
> >> node,
> >> > the
> >> > > calling stack is:
> >> > > WalSendServer_ServiceRequest() => WalSendServerDoRequest() =>
> >> > > disconnectMirrorQD_SendClose() => write_qd_sync() => PQsendQuery()
> >> > >
> >> > > - On the standby side, all API are similar,  e.g.
> >> > walredoserver_forkexec()
> >> > > vs walsendserver_forkexec()
> >> > >
> >> > > Hope it helps you! ~_~
> >> > >
> >> > >
> >> > >
> >> > > On Thu, Aug 11, 2016 at 1:09 AM, Kyle Dunn <kd...@pivotal.io>
> wrote:
> >> > >
> >> > > > Hello,
> >> > > >
> >> > > > I'm investigating DR options for HAWQ and was curious about the
> >> > existing
> >> > > > master catalog synchronization process. My question is mainly
> around
> >> > what
> >> > > > this process does at a high level and where I might look in the
> code
> >> > base
> >> > > > or management tools to see about extending it for additional
> standby
> >> > > > masters (e.g. one in a geographically distant data center and/or
> >> > > different
> >> > > > logical HAWQ cluster). The assumption is the HDFS blocks would be
> >> > > > replicated by something like distcp via Falcon.
> >> > > >
> >> > > > I believe there are obvious things to address like DFS / namenode
> >> URI
> >> > > > parameters, FQDNs, and certainly failure scenarios / edge cases,
> but
> >> > I'm
> >> > > > mainly trying to get a dialog started to see what input, ideas,
> and
> >> > > > considerations others have. One thing I'm specifically interested
> >> in is
> >> > > > whether / how WAL can be used (@Keaton).
> >> > > >
> >> > > >
> >> > > > Thanks,
> >> > > > Kyle
> >> > > > --
> >> > > > *Kyle Dunn | Data Engineering | Pivotal*
> >> > > > Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
> >> > > >
> >> > >
> >> > --
> >> > *Kyle Dunn | Data Engineering | Pivotal*
> >> > Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
> >> >
> >>
> > --
> > *Kyle Dunn | Data Engineering | Pivotal*
> > Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io
> >
>

Re: HAWQ standby master sync process

Posted by Wen Lin <wl...@pivotal.io>.
Kyle,

When HAWQ cluster is initialized, if a standby master is configured in
hawq-site.xml, the HAWQ scripts will initialize standby master on one node,
and register it into master's gp_segment_configuration table. So the master
knows standby master from this catalog table.
Unlike segment instance, which is register itself by sending heartbeat
message to master, standby master has no heartbeat message.
It's not possible to have two standby masters running together, if you
initialize another standby master, the first one in
gp_segment_configuration table will be removed.

Regards!

Wen

On Tue, Sep 13, 2016 at 5:32 AM, Kyle Dunn <kd...@pivotal.io> wrote:

> Hey Ming -
>
> Am I understanding correctly that a standby master will register
> automagically to the active master, based on the contents of hawq-site.xml?
>
> What would happen if two different standby masters on different nodes both
> tried registering with the same active master? I ask because this is the
> exact situation that would be useful for having a passive DR site with HAWQ
> installed, querying for new WALs in the same flow as a local standby.
>
> As for "daisy chaining" masters, which I believe is what you described in
> (2) above: Master -> WAL -> Standby -> DR node, I think this may be less
> desirable than multiple "normal" standby client nodes, as losing the
> standby node becomes a cascading failure into DR.
>
> Anytime we can make use of the DFS available (I say DFS, rather than HDFS,
> as the hope is eventually this would be S3, Azure blob, Ceph, etc) - we
> should!  (unrelated to DR) In my mind, this includes propagating the
> system catalog to segment nodes via the underlying DFS, rather than
> transmitting as part of each query.
>
> Thank you for the helpful insight and discussion!
>
>
> -Kyle
>
> On Thu, Sep 8, 2016 at 10:55 PM Ming Li <ml...@pivotal.io> wrote:
>
>> Hi Kyle,
>>
>> As for your question how to config standby host, when standby nodes(which
>> is config in hawq-site.xml) started, it will auto registered it's info in
>> the system table gp_segment_configuration(
>> there is system table:
>> http://hdb.docs.pivotal.io/20/reference/catalog/gp_segment_
>> configuration.html),
>> so that hawq can use this info internally in catalog.  if you need more
>> details about it, @wen lin can help you.
>>
>> Then standby will report the LSN of WALs it synched to master node, master
>> node according to this LSN to test the gap between master and node is
>> still
>> in xlog file or it is overwritten (because xlog file recycled). If the gap
>> is not in the xlog file, we cannot do further just report "out of sync",
>> which need to manually run hawq init standby to recreate standby node;
>> else
>> we just push the WAL after this LSN to standby node, and redo them. All
>> related standby script problem can ask @radar for help.
>>
>> In most cases the standby should be less workload than master, so I
>> suggestion maybe we can implement it as:
>> (1) Master push WAL to standby node, when standby received them, it
>> firstly
>> write to file, then report successfully to master so that no blocking
>> transaction commit.
>> (2) standby node redo them on this node, and at the same time, it need to
>> guarantee that the WAL should be transferred to the remote DR node, we can
>> set different sync policy (whether need to guarantee WAL transferred to
>> remote node when transaction committed ) in case of different transaction
>> commit latency and different data loss acceptance at remote node.
>>
>> More to discussed:
>> (1) If standby "report out of sync" and gap is not available on master
>> node, we need to reinit standby manually, which need to shutdown master
>> node. We need to think an stronger policy for this scenario, e.g. just
>> push
>> WAL to other nodes, and write as duplicate file? or we can further to
>> write
>> into hdfs directly?
>> (2) If multiple master feature implemented, maybe the design need to be
>> changed. I don't take time on it.
>>
>> Any comments or suggestions are welcomed. Thanks.

Re: HAWQ standby master sync process

Posted by Kyle Dunn <kd...@pivotal.io>.
Hey Ming -

Am I understanding correctly that a standby master will register
automagically with the active master, based on the contents of hawq-site.xml?

What would happen if two different standby masters on different nodes both
tried registering with the same active master? I ask because this is the
exact situation that would be useful for having a passive DR site with HAWQ
installed, querying for new WALs in the same flow as a local standby.

As for "daisy chaining" masters, which I believe is what you described in
(2) above: Master -> WAL -> Standby -> DR node, I think this may be less
desirable than multiple "normal" standby client nodes, as losing the
standby node becomes a cascading failure into DR.

Anytime we can make use of the available DFS (I say DFS rather than HDFS, as
the hope is that eventually this could be S3, Azure blob, Ceph, etc.), we
should! Unrelated to DR: in my mind this includes propagating the system
catalog to segment nodes via the underlying DFS, rather than transmitting it
as part of each query.

Thank you for the helpful insight and discussion!


-Kyle

-- 
*Kyle Dunn | Data Engineering | Pivotal*
Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io

Re: HAWQ standby master sync process

Posted by Ming Li <ml...@pivotal.io>.
Hi Kyle,

As for your question about how the standby host is configured: when the
standby node (which is configured in hawq-site.xml) starts, it automatically
registers its info in the system table gp_segment_configuration (see
http://hdb.docs.pivotal.io/20/reference/catalog/gp_segment_configuration.html),
so that HAWQ can use this info internally in the catalog. If you need more
details about it, @wen lin can help you.
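
If you just want to see what is registered, a minimal check from the master
could look like the sketch below (the role/status columns follow the doc
linked above; the sketch itself is only an illustration, not part of any
HAWQ tooling):

    # Sketch only: list the rows of gp_segment_configuration via psql.
    # Assumes psql is on PATH and a reachable database named template1.
    import subprocess

    SQL = "SELECT role, status, hostname, port FROM gp_segment_configuration;"

    p = subprocess.Popen(["psql", "-d", "template1", "-A", "-c", SQL],
                         stdout=subprocess.PIPE)
    out, _ = p.communicate()
    print(out)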

Then the standby reports to the master node the LSN up to which it has synced
the WAL. The master uses this LSN to check whether the gap between master and
standby is still present in the xlog files or has already been overwritten
(because xlog files are recycled). If the gap is no longer in the xlog files,
we cannot go any further and just report "out of sync", which requires
manually running hawq init standby to recreate the standby node; otherwise we
just push the WAL after this LSN to the standby node and redo it there. For
any problem with the standby scripts you can ask @radar for help.
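
In pseudo code the master-side decision is roughly like the sketch below (all
names here are made up for illustration, they are not the real HAWQ symbols):

    # Sketch of the decision the master makes when the standby reports the
    # LSN it has synced. Illustrative only; not the real implementation.
    def choose_action(standby_lsn, oldest_lsn_still_on_disk):
        if standby_lsn < oldest_lsn_still_on_disk:
            # The gap was recycled away with old xlog files, so incremental
            # catch-up is impossible: report out of sync and the operator
            # must re-run 'hawq init standby'.
            return "out-of-sync"
        # Otherwise ship the WAL after standby_lsn and let the standby redo it.
        return "send-wal-from-%s" % standby_lsn

    print(choose_action(0x2000000, 0x3000000))  # -> out-of-sync
    print(choose_action(0x5000000, 0x3000000))  # -> send-wal-from-83886080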

In most cases the standby carries less workload than the master, so my
suggestion is that we could implement it as:
(1) The master pushes WAL to the standby node; when the standby receives it,
it first writes it to file and then reports success to the master, so that
transaction commit is not blocked.
(2) The standby node redoes the WAL locally and, at the same time, makes sure
the WAL is transferred to the remote DR node. We can offer different sync
policies (whether a commit must wait until the WAL has reached the remote
node), trading transaction commit latency against how much data loss is
acceptable at the remote node (see the sketch right after this list).
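
Just to make the policy point concrete, a rough sketch of the standby-side
flow could be something like this (illustrative only, nothing here is the
real walsendserver/walredoserver code):

    # Illustrative only: ack the master quickly, ship WAL to the DR node
    # either synchronously or asynchronously depending on a policy knob.
    import collections
    import io

    REMOTE_SYNC_ON_COMMIT = False        # policy knob: latency vs. data loss at DR
    dr_backlog = collections.deque()     # stand-in for a background ship queue

    def send_to_dr(wal_chunk):
        # Placeholder for the real network transfer to the remote DR node.
        print("shipped %d bytes to DR" % len(wal_chunk))

    def on_wal_from_master(wal_chunk, local_log):
        local_log.write(wal_chunk)       # 1. persist locally first
        local_log.flush()
        if REMOTE_SYNC_ON_COMMIT:
            send_to_dr(wal_chunk)        # 2a. wait until the DR node has it
        else:
            dr_backlog.append(wal_chunk) # 2b. ship later in the background
        return True                      # 3. ack the master; commit not blocked
                                         #    (local redo then happens as usual)

    print(on_wal_from_master(b"xlog-bytes", io.BytesIO()))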

More to be discussed:
(1) If the standby reports "out of sync" and the gap is no longer available on
the master node, we need to re-init the standby manually, which requires
shutting down the master node. We need to think about a stronger policy for
this scenario, e.g. just push the WAL to other nodes and write it as a
duplicate file? Or go further and write it into HDFS directly? (A tiny sketch
of what I mean follows right after this list.)
(2) If the multiple-master feature is implemented, the design may need to
change. I have not spent time on that yet.
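
For (1), the simplest form of "write into hdfs directly" I can imagine is
just copying each finished xlog segment out to HDFS as a duplicate; roughly
(the hdfs invocation and target path are assumptions, not a design):

    # Illustrative only: archive one finished xlog segment to HDFS.
    import subprocess

    def archive_xlog_segment(segment_path, hdfs_dir="/hawq_wal_archive"):
        # '-f' overwrites an existing copy; hdfs_dir is a made-up location.
        subprocess.check_call(["hdfs", "dfs", "-put", "-f",
                               segment_path, hdfs_dir])

    # e.g. archive_xlog_segment("<MASTER_DATA_DIRECTORY>/pg_xlog/<segment>")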

Any comments or suggestions are welcome. Thanks.


Re: HAWQ standby master sync process

Posted by Kyle Dunn <kd...@pivotal.io>.
Ming -

Thank you for the info, this is very helpful in understanding how WAL
shipment happens.

One question I have is whether/where the destination host is configured in
walsendserver.c. Alternatively, does the standby master client initiate the
request, rather than the active master pushing out WAL as it becomes
available? I ask because a more robust DR solution than the one I'm currently
working on would allow multiple standby targets (i.e. one traditional
standby, one DR mirror, etc.).

At the moment I've opted for an approach that stops the active HAWQ master,
creates a tarball of the entire MASTER_DATA_DIRECTORY, archives it on HDFS,
then invokes distcp via Apache Falcon to mirror /hawq_default in HDFS to
the DR site. After a DR event there would be some manual process to restore
said archive and update the hostname / DFS references to reflect the actual
DR environment.
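
The core of that orchestration is only a handful of commands; roughly (the
paths, archive name, and exact hawq/hdfs flags below are placeholders from
memory rather than lifted from my script):

    # Rough sketch of the current approach (Python 2.6 compatible).
    import os
    import subprocess
    import time

    def archive_master_data_directory(hdfs_dir="/hawq_dr"):
        mdd = os.environ.get("MASTER_DATA_DIRECTORY", "/data/hawq/master")
        tarball = "/tmp/hawq-mdd-%d.tar" % int(time.time())
        subprocess.check_call(["hawq", "stop", "master", "-a"])   # brief outage
        try:
            # Skip pg_log and don't compress, to keep the outage short.
            subprocess.check_call(["tar", "--exclude=pg_log", "-cf",
                                   tarball, "-C", mdd, "."])
            subprocess.check_call(["hdfs", "dfs", "-put", "-f",
                                   tarball, hdfs_dir])
        finally:
            subprocess.check_call(["hawq", "start", "master", "-a"])
        # Falcon/distcp then mirrors /hawq_default (and hdfs_dir) to the DR site.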

This approach is a step in the right direction but the act of creating the
tarball necessitates a brief HAWQ master outage (currently ~1 minute when
excluding pg_log contents and not compressing), whereas extending the
walserver code could avoid any outage by allowing WAL replication to have
multiple destinations.

The top-level orchestration for this process is currently written in
Python 2.6 compatible code - I'd like to have it reviewed by the DEV team,
if possible, as a first step toward a future PR for "HAWQ DR" via Falcon.

Thoughts?


-Kyle

-- 
*Kyle Dunn | Data Engineering | Pivotal*
Direct: 303.905.3171 <3039053171> | Email: kdunn@pivotal.io