
Posted to derby-dev@db.apache.org by "Øystein Grøvlen (JIRA)" <ji...@apache.org> on 2008/02/04 11:31:08 UTC

[jira] Created: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Replication: Slave must inform master if DBs are out of sync.
-------------------------------------------------------------

                 Key: DERBY-3382
                 URL: https://issues.apache.org/jira/browse/DERBY-3382
             Project: Derby
          Issue Type: Bug
          Components: Replication
    Affects Versions: 10.4.0.0
            Reporter: Øystein Grøvlen
             Fix For: 10.4.0.0


If I copy the database to the slave before booting the master, the slave will be out of sync with the master, since new log records are created during booting.  The slave will then stop replication, but the master will not be notified.
If I then try to stop or fail over the master, the master will hang.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jørgen Løland updated DERBY-3382:
---------------------------------

    Attachment: derby-3382-test-1a.diff
                derby-3382-test-1a.stat

The attached patch contains a regression test for this issue.


[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569277#action_12569277 ] 

Øystein Grøvlen commented on DERBY-3382:
----------------------------------------

Jørgen Løland (JIRA) wrote:
> Seems like LogToFile was never informed that it is no longer in replication mode :-/

Strictly speaking, it has never been in replication mode since replication startup did not succeed. 


[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Øystein Grøvlen updated DERBY-3382:
-----------------------------------

    Derby Info:   (was: [Patch Available])

Thanks for addressing all my comments Jørgen.
Patch derby-3382-test-1b.diff  committed at revision 641221.


[jira] Closed: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Øystein Grøvlen closed DERBY-3382.
----------------------------------



[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jørgen Løland updated DERBY-3382:
---------------------------------

    Attachment: derby-3382-1b.diff
                derby-3382-1b.stat

Patch 1b addresses Øystein's comments:

* The out of synch error message is now displayed (no longer
  wrapped in an XRE04 exception)
* The NPE is removed by calling
  logToFile.stopReplicationSlaveRole if startMaster fails. Also
  added cleanup of the network connection if startMaster fails
1 and 2: fixed
3: Would getChunkLastInstant (or getChunkInstant) be more
   intuitive? In that case I think we should add 'Chunk' to
   getData and getSize as well. In patch 1b I only changed the
   javadoc slightly.
4: Good catch! I hadn't noticed that method, but I still need the
   long representation. Changed name of the method to match the
   existing getFirstUnflushedInstant

All tests passed.


[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jørgen Løland updated DERBY-3382:
---------------------------------

    Attachment: derby-3382-1a.diff
                derby-3382-1a.stat

Patch v1a adds a check of the log files to the replication initialization so that replication does not start if the log files are out of synch. The master will be notified whether or not the log files are synched.
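
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: the class, the method names, and the rule "the master's instant must equal the end of the slave's log" are all assumptions made for the example. The point is only that the slave answers with an explicit accept or reject during initialization, so the master always learns the outcome.

```java
// Hypothetical sketch of a log-position check during replication startup.
// None of these names are taken from the Derby source.
final class SyncCheckSketch {

    /** Reply the slave sends back to the master during initialization. */
    enum Reply { ACK, OUT_OF_SYNC }

    /** Slave side: accept only if the master continues exactly where the local log ends. */
    static Reply checkInstant(long masterInstant, long slaveHighestInstant) {
        return masterInstant == slaveHighestInstant ? Reply.ACK : Reply.OUT_OF_SYNC;
    }

    /** Master side: fail startup instead of silently entering replication mode. */
    static void startMaster(long masterInstant, long slaveHighestInstant) {
        Reply reply = checkInstant(masterInstant, slaveHighestInstant);
        if (reply != Reply.ACK) {
            throw new IllegalStateException(
                "log files are out of synch; replication not started");
        }
    }

    public static void main(String[] args) {
        startMaster(100L, 100L); // logs match: startup proceeds
        try {
            startMaster(120L, 100L); // master logged more while booting
        } catch (IllegalStateException e) {
            System.out.println("master notified: " + e.getMessage());
        }
    }
}
```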

All tests passed, the patch is ready for review.


[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568903#action_12568903 ] 

Øystein Grøvlen commented on DERBY-3382:
----------------------------------------

When I test this, I do not get the error message I expected on the master.  Below is an example where I do not connect to masterDB and freeze it before it is copied.  Instead of the out-of-synch message, I get a "could not establish connection" message.  The slave prints the out-of-synch message.  Note also that if I try a normal connect afterwards, a first attempt to freeze the DB fails with an NPE, and both a second attempt to call freeze and 'quit' then hang.

ij version 10.4
ij> connect 'jdbc:derby:masterDB;user=oystein;password=pass;startMaster=true;slaveHost=localhost';
ERROR XRE04: Could not establish a connection to the peer of the replicated database 'masterDB' on address 'localhost:-1'.
ij> connect 'jdbc:derby:masterDB;user=oystein;password=pass';
ij> select sum(i), avg(i), count(*), max(i) from t;
1          |2          |3          |4          
-----------------------------------------------
528        |16         |32         |32         

1 row selected
ij> call syscs_util.syscs_freeze_database();
ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression.
ERROR XJ001: Java exception: ': java.lang.NullPointerException'.
ij> quit
<hangs>


[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jørgen Løland updated DERBY-3382:
---------------------------------

    Derby Info: [Patch Available]


[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jørgen Løland updated DERBY-3382:
---------------------------------

    Attachment: derby-3382-test-1b.diff
                derby-3382-test-1b.stat

Thank you for the review, Øystein. Patch 1b addresses your comments.

After updating my sandbox, I got the same exception as you did. It turned out to be caused by another patch invalidating mine. The replication test suite completed successfully. Requesting review.


[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571408#action_12571408 ] 

Øystein Grøvlen commented on DERBY-3382:
----------------------------------------

Thanks for the follow-up patch, Jørgen.  It addresses my comments, and
I have checked that the problems I reported have been fixed.  I will
commit this patch.  Some minor issues in MasterController that you may
consider for a later patch:

 * teardownNetwork: I am not sure this is a good name for the method
   since it is actually doing more than just shutting down the network
   connection.

 * setupConnection:  What is the point of catching StandardException
   if you are just rethrowing it?




[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578664#action_12578664 ] 

Øystein Grøvlen commented on DERBY-3382:
----------------------------------------

The error reported above when running in an empty directory may not be related to this patch anyway.  I just got it on the 10.4 branch, too.  A bit strange that it disappeared when I removed this patch, though.


[jira] Resolved: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jørgen Løland resolved DERBY-3382.
----------------------------------

    Resolution: Fixed

Resolving as fixed - Issue fixed, and regression test committed.


[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569227#action_12569227 ] 

Jørgen Løland commented on DERBY-3382:
--------------------------------------

I had a look at the hang, which is caused by LogToFile calling masterFac#flushedTo:

lines 3966 and on...: 
if (inReplicationMasterMode) {
    masterFactory.flushedTo(
        LogCounter.makeLogInstantAsLong(fileNumber, wherePosition));
}

Seems like LogToFile was never informed that it is no longer in replication mode :-/
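
To make the failure mode concrete, here is a small sketch of a flush path guarded by a replication flag. It is an illustration under assumptions, not Derby's LogToFile: the class, its fields, and the counter standing in for the call to the master factory are all hypothetical. If a failed startup leaves the flag set, every later flush still goes to the replication layer; resetting the flag in the failure path avoids that.

```java
// Hypothetical sketch; the class and method names are illustrative only.
final class LogFlagSketch {
    private boolean inReplicationMasterMode;
    private int forwardedFlushes;

    /** Enter master mode, then undo it if the slave rejects the connection. */
    boolean startMaster(boolean slaveAccepts) {
        inReplicationMasterMode = true;
        if (!slaveAccepts) {
            stopReplicationMasterRole(); // without this the flag would stay set
            return false;
        }
        return true;
    }

    void stopReplicationMasterRole() {
        inReplicationMasterMode = false;
    }

    /** Flush path: forward to the replication layer only while the flag is set. */
    void flush() {
        if (inReplicationMasterMode) {
            forwardedFlushes++; // stands in for the call to the master factory
        }
    }

    int forwardedFlushes() {
        return forwardedFlushes;
    }
}
```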


[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568907#action_12568907 ] 

Øystein Grøvlen commented on DERBY-3382:
----------------------------------------

Comments to the 1a patch:

1. ReplicationMessageReceive#parseAndAckInstant:  No need to inform
   the master in case of an unexpected message sequence?

2. MasterController:  If and else parts are identical for both changes.
   I guess you are supposed to use getHighestShippedInstance() for the
   if-part.

3. AsynchronousLogShipper:  I find the naming of
   ReplicationLogBuffer#getLastInstant() a bit confusing.  I first
   wondered whether it represented the last instant added to the
   ReplicationLogBuffer, but I guess it is the last instant of the
   first chunk of the buffer.

4. LogToFile#getFlushedInstant: Looks very similar to
   getFirstUnflushedInstant except it returns a long instead of a
   LogCounter.  However, the names seem to indicate that they are
   different.  By the way, the latter is synchronized while the former
   is not.  Is there a justification for that?
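
As background for the synchronization question in item 4: in Java, a read of a non-volatile long is not guaranteed to be atomic, and a value assembled from two fields can mix old and new state unless the reader and the writer hold the same lock. The sketch below is a hypothetical illustration, not the Derby code; the field names and the packing of a file number and a position into one long are assumptions for the example.

```java
// Hypothetical sketch: a log position read through a synchronized accessor.
final class FlushedInstantSketch {
    private long fileNumber;
    private long position;

    /** Writer: both fields change together under the object's lock. */
    synchronized void advance(long newFileNumber, long newPosition) {
        fileNumber = newFileNumber;
        position = newPosition;
    }

    /**
     * Reader: takes the same lock, so it never combines an old file number
     * with a new position. The packing (file number in the high 32 bits) is
     * an assumption made for this example and requires position < 2^32.
     */
    synchronized long getFlushedInstant() {
        return (fileNumber << 32) | position;
    }
}
```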





[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Øystein Grøvlen updated DERBY-3382:
-----------------------------------

    Fix Version/s:     (was: 10.4.0.0)
                   10.5.0.0
                   10.4.1.0

Test merged to the 10.4 branch at revision 641329.


[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571412#action_12571412 ] 

Øystein Grøvlen commented on DERBY-3382:
----------------------------------------

Jørgen Løland (JIRA) wrote:
> re setupConnection: it prevents the next catch (Exception) from wrapping a StandardException in another StandardException

Ah, good point.  Objection withdrawn.
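
The pattern under discussion can be sketched as follows. This is a minimal illustration with a stand-in exception class, not the MasterController code; the nested StandardException, the Step interface, and the message text are hypothetical. The first catch lets an already meaningful error (for example the out-of-synch message) pass through untouched, while the second wraps everything else.

```java
// Hypothetical stand-in for Derby's StandardException; only the catch ordering matters here.
final class RethrowSketch {

    static class StandardException extends Exception {
        StandardException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** A unit of work that may fail with any exception. */
    interface Step {
        void run() throws Exception;
    }

    static void setupConnection(Step step) throws StandardException {
        try {
            step.run();
        } catch (StandardException se) {
            // Rethrow as-is so the specific error is not hidden by a generic wrapper.
            throw se;
        } catch (Exception e) {
            // Anything else is reported as a generic connection failure.
            throw new StandardException("could not establish connection", e);
        }
    }
}
```

Without the first catch clause, the StandardException would match catch (Exception) and be reported as the cause of a second, less specific StandardException.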



[jira] Assigned: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jørgen Løland reassigned DERBY-3382:
------------------------------------

    Assignee: Jørgen Løland


[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Øystein Grøvlen updated DERBY-3382:
-----------------------------------

    Fix Version/s:     (was: 10.4.1.0)
                       (was: 10.5.0.0)
                   10.4.0.0

Reset fix version.  Bug was fixed before 10.4 release branch was created.  It was just the test patch that needed to be merged.


[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565300#action_12565300 ] 

Jørgen Løland commented on DERBY-3382:
--------------------------------------

An additional problem is that the slave does not realize that it is out of sync until the first chunk of log records is shipped. Checking whether master and slave are in sync should be part of the replication network initialization.
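
A minimal sketch of what such a check at connection time might look like, assuming both sides can report the end of their log as a single number. All names here (SyncHandshakeSketch, Reply, logInstant, slaveCheck, masterHandleReply) are invented for illustration and are not taken from the Derby source or from the attached patches.

```java
// Hypothetical sketch: none of these names come from the Derby code base.
final class SyncHandshakeSketch {

    /** Reply the slave sends back during network initialization. */
    enum Reply { ACK, OUT_OF_SYNC }

    /**
     * Pack a log file number and a position in that file into one long,
     * so that two log positions can be compared with a single ==.
     * Assumes 0 <= position < 2^32.
     */
    static long logInstant(long fileNumber, long position) {
        return (fileNumber << 32) | position;
    }

    /** Slave side: compare the instant announced by the master with the local one. */
    static Reply slaveCheck(long masterInstant, long slaveInstant) {
        return masterInstant == slaveInstant ? Reply.ACK : Reply.OUT_OF_SYNC;
    }

    /** Master side: only enter replication mode if the slave acknowledged. */
    static void masterHandleReply(Reply reply) {
        if (reply != Reply.ACK) {
            throw new IllegalStateException("slave reports databases out of sync");
        }
    }

    public static void main(String[] args) {
        long slaveEnd = logInstant(1, 4096);
        // The master wrote additional log records while booting.
        long masterEnd = logInstant(1, 4200);

        Reply reply = slaveCheck(masterEnd, slaveEnd);
        try {
            masterHandleReply(reply);
            System.out.println("replication started");
        } catch (IllegalStateException e) {
            System.out.println("replication not started: " + e.getMessage());
        }
    }
}
```

With a check like this the master learns about the mismatch before any log is shipped, so it is not left waiting on a slave that has already stopped replication.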

> Replication: Slave must inform master if DBs are out of sync.
> -------------------------------------------------------------
>
>                 Key: DERBY-3382
>                 URL: https://issues.apache.org/jira/browse/DERBY-3382
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 10.4.0.0
>            Reporter: Øystein Grøvlen
>             Fix For: 10.4.0.0
>
>
> If I copy the database to the slave before booting the master, slave will be out of sync with the master since new log records are created during booting.  The slave will then stop replication, but the master will not be notified.
> If I then try to stop or failover the master the master will hang.


[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578502#action_12578502 ] 

Øystein Grøvlen commented on DERBY-3382:
----------------------------------------

Thanks for the test, Jørgen.  Test case looks good, but I have a few
comments on the modifications to the framework:

1. I suggest renaming slaveConnException to startSlaveException to
   indicate that it is specific to the startSlave command.

2. slaveConnException should be set to null before starting the slave
   in order to make it possible to start the slave more than once in
   the same test.

3. Are you sure it is a good idea to use the connection obtained when
   starting the master for further operations on the master?  Would it
   not be more general to use a separate connection?  Anyhow, if
   ReplicationRun is to provide the capability to use an existing
   connection, I think getMasterConnection needs to be able to open a
   new connection if none exists.  (An alternative would be to
   see whether ReplicationRun could extend
   BaseJDBCTestCase and use its connection handling for connecting to
   the master database.)

4. Why do you need to copy the code from
   BaseJDBCTestCase#assertSQLState? That method is static so you
   should be able to use it as it is.

5. assertSQLStateSlaveConn: Javadoc should state that it will wait for
   some time for the start slave command to complete.  Instead of "Slave
   connection attempt hangs ...", I would prefer "Attempt to start
   slave hangs ...", or something like that.

6. If I run the replication suite with this patch from an empty
   directory, it fails.  This does not happen without the patch:

> java junit.textui.TestRunner -noloading org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationSuite
..java.io.IOException: Cannot run program "/usr/local/java/jdk1.6.0_01/jre/lib/../bin/java" (in directory "/export/tmp/oysteing/derby-repl/testing_repl/export/tmp/oysteing/derby-repl/testing_repl/db_master"): error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
        at java.lang.Runtime.exec(Runtime.java:593)
        at java.lang.Runtime.exec(Runtime.java:431)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun$4.run(ReplicationRun.java:2250)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
        at java.lang.ProcessImpl.start(ProcessImpl.java:65)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
        ... 4 more
java.lang.Exception: DRDA_NoIO.S:Could not connect to Derby Network Server on host 127.0.0.1, port 1527: Connection refused
        at org.apache.derby.impl.drda.NetworkServerControlImpl.consolePropertyMessageWork(NetworkServerControlImpl.java:3173)
        at org.apache.derby.impl.drda.NetworkServerControlImpl.consolePropertyMessage(NetworkServerControlImpl.java:1855)
        at org.apache.derby.impl.drda.NetworkServerControlImpl.setUpSocket(NetworkServerControlImpl.java:2497)
        at org.apache.derby.impl.drda.NetworkServerControlImpl.ping(NetworkServerControlImpl.java:1138)
        at org.apache.derby.drda.NetworkServerControl.ping(NetworkServerControl.java:395)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.ping(ReplicationRun.java:2600)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.pingServer(ReplicationRun.java:2587)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.startServer(ReplicationRun.java:2264)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local.testLogFilesSynched(ReplicationRun_Local.java:182)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at junit.framework.TestCase.runTest(TestCase.java:164)
        at junit.framework.TestCase.runBare(TestCase.java:130)
        at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:101)
        at junit.framework.TestResult$1.protect(TestResult.java:106)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.framework.TestResult.run(TestResult.java:109)
        at junit.framework.TestCase.run(TestCase.java:120)
        at junit.framework.TestSuite.runTest(TestSuite.java:230)
        at junit.framework.TestSuite.run(TestSuite.java:225)
        at junit.framework.TestSuite.runTest(TestSuite.java:230)
        at junit.framework.TestSuite.run(TestSuite.java:225)
        at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
        at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.extensions.TestSetup.run(TestSetup.java:25)
        at junit.framework.TestSuite.runTest(TestSuite.java:230)
        at junit.framework.TestSuite.run(TestSuite.java:225)
        at junit.textui.TestRunner.doRun(TestRunner.java:121)
        at junit.textui.TestRunner.start(TestRunner.java:185)
        at junit.textui.TestRunner.main(TestRunner.java:143)
java.io.IOException: Cannot run program "/usr/local/java/jdk1.6.0_01/jre/lib/../bin/java" (in directory "/export/tmp/oysteing/derby-repl/testing_repl/export/tmp/oysteing/derby-repl/testing_repl/db_slave"): error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
        at java.lang.Runtime.exec(Runtime.java:593)
        at java.lang.Runtime.exec(Runtime.java:431)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun$4.run(ReplicationRun.java:2250)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
        at java.lang.ProcessImpl.start(ProcessImpl.java:65)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
        ... 4 more
java.lang.Exception: DRDA_NoIO.S:Could not connect to Derby Network Server on host 127.0.0.1, port 4527: Connection refused
        at org.apache.derby.impl.drda.NetworkServerControlImpl.consolePropertyMessageWork(NetworkServerControlImpl.java:3173)
        at org.apache.derby.impl.drda.NetworkServerControlImpl.consolePropertyMessage(NetworkServerControlImpl.java:1855)
        at org.apache.derby.impl.drda.NetworkServerControlImpl.setUpSocket(NetworkServerControlImpl.java:2497)
        at org.apache.derby.impl.drda.NetworkServerControlImpl.ping(NetworkServerControlImpl.java:1138)
        at org.apache.derby.drda.NetworkServerControl.ping(NetworkServerControl.java:395)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.ping(ReplicationRun.java:2600)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.pingServer(ReplicationRun.java:2587)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.startServer(ReplicationRun.java:2264)
        at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local.testLogFilesSynched(ReplicationRun_Local.java:188)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at junit.framework.TestCase.runTest(TestCase.java:164)
        at junit.framework.TestCase.runBare(TestCase.java:130)
        at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:101)
        at junit.framework.TestResult$1.protect(TestResult.java:106)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.framework.TestResult.run(TestResult.java:109)
        at junit.framework.TestCase.run(TestCase.java:120)
        at junit.framework.TestSuite.runTest(TestSuite.java:230)
        at junit.framework.TestSuite.run(TestSuite.java:225)
        at junit.framework.TestSuite.runTest(TestSuite.java:230)
        at junit.framework.TestSuite.run(TestSuite.java:225)
        at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
        at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.extensions.TestSetup.run(TestSetup.java:25)
        at junit.framework.TestSuite.runTest(TestSuite.java:230)
        at junit.framework.TestSuite.run(TestSuite.java:225)
        at junit.textui.TestRunner.doRun(TestRunner.java:121)
        at junit.textui.TestRunner.start(TestRunner.java:185)
        at junit.textui.TestRunner.main(TestRunner.java:143)
E.^C
>
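
Comments 1, 2 and 5 above could be combined roughly as in the sketch below. It is only an illustration under assumptions: the class StartSlaveSketch, the SlaveStarter interface and the SQLState "SKTCH" are made up for the example and do not appear in ReplicationRun or in the patch. The field is named startSlaveException as suggested, it is reset to null before each start, and the wait for the start slave command is bounded by a timeout.

```java
import java.sql.SQLException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not the code from the test framework.
final class StartSlaveSketch {

    /** Stand-in for whatever actually issues the start slave command. */
    interface SlaveStarter {
        void start() throws SQLException;
    }

    // Exception raised by the most recent start slave attempt, if any.
    private volatile SQLException startSlaveException;
    private volatile CountDownLatch finished = new CountDownLatch(0);

    /** Runs the start slave command in a background thread. */
    void startSlave(SlaveStarter starter) {
        // Reset first, so the slave can be started more than once per test.
        startSlaveException = null;
        final CountDownLatch latch = new CountDownLatch(1);
        finished = latch;
        Thread t = new Thread(() -> {
            try {
                starter.start();
            } catch (SQLException e) {
                startSlaveException = e;
            } finally {
                latch.countDown();
            }
        });
        t.setDaemon(true);
        t.start();
    }

    /**
     * Waits up to timeoutMillis for the start slave command to complete.
     * Returns the SQLState of the failure, or null if the command succeeded.
     */
    String awaitStartSlaveState(long timeoutMillis) {
        try {
            if (!finished.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("Attempt to start slave hangs");
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for start slave", ie);
        }
        SQLException e = startSlaveException;
        return e == null ? null : e.getSQLState();
    }

    public static void main(String[] args) {
        StartSlaveSketch run = new StartSlaveSketch();

        // First attempt fails with a placeholder SQLState.
        run.startSlave(() -> { throw new SQLException("placeholder failure", "SKTCH"); });
        System.out.println("first start: " + run.awaitStartSlaveState(5000));

        // Second attempt succeeds; the old exception must not leak into it.
        run.startSlave(() -> { });
        System.out.println("second start: " + run.awaitStartSlaveState(5000));
    }
}
```

In the real test the recorded exception could presumably be handed to the existing static BaseJDBCTestCase#assertSQLState, as comment 4 suggests, instead of a copied version of that method.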


> Replication: Slave must inform master if DBs are out of sync.
> -------------------------------------------------------------
>
>                 Key: DERBY-3382
>                 URL: https://issues.apache.org/jira/browse/DERBY-3382
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 10.4.0.0
>            Reporter: Øystein Grøvlen
>            Assignee: Jørgen Løland
>             Fix For: 10.4.0.0
>
>         Attachments: derby-3382-1a.diff, derby-3382-1a.stat, derby-3382-1b.diff, derby-3382-1b.stat, derby-3382-test-1a.diff, derby-3382-test-1a.stat
>
>
> If I copy the database to the slave before booting the master, slave will be out of sync with the master since new log records are created during booting.  The slave will then stop replication, but the master will not be notified.
> If I then try to stop or failover the master the master will hang.


[jira] Commented: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571410#action_12571410 ] 

Jørgen Løland commented on DERBY-3382:
--------------------------------------

Hi Øystein, thanks for committing the patch.

Re setupConnection: the extra catch prevents the following catch (Exception) from wrapping a StandardException in another StandardException.
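
The general pattern, sketched with a stand-in exception type rather than Derby's StandardException. The names RethrowSketch, AppException, Step and setup are invented for this example and are not taken from the patch.

```java
import java.io.IOException;

// Hypothetical sketch of the catch ordering, not the code from the patch.
final class RethrowSketch {

    /** Stand-in for the application's own exception type. */
    static class AppException extends Exception {
        AppException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Some work that may fail with any kind of exception. */
    interface Step {
        void run() throws Exception;
    }

    static void setup(Step step) throws AppException {
        try {
            step.run();
        } catch (AppException e) {
            // Already the right type: pass it through untouched.
            throw e;
        } catch (Exception e) {
            // Anything else is wrapped exactly once.
            throw new AppException("setup failed", e);
        }
    }

    public static void main(String[] args) {
        AppException original = new AppException("already specific", null);
        try {
            setup(() -> { throw original; });
        } catch (AppException e) {
            System.out.println("same instance: " + (e == original));
        }
        try {
            setup(() -> { throw new IOException("socket closed"); });
        } catch (AppException e) {
            System.out.println("wrapped cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Without the first catch clause, the second one would also catch AppException and bury the original message and state one level down in the cause chain.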

> Replication: Slave must inform master if DBs are out of sync.
> -------------------------------------------------------------
>
>                 Key: DERBY-3382
>                 URL: https://issues.apache.org/jira/browse/DERBY-3382
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 10.4.0.0
>            Reporter: Øystein Grøvlen
>            Assignee: Jørgen Løland
>             Fix For: 10.4.0.0
>
>         Attachments: derby-3382-1a.diff, derby-3382-1a.stat, derby-3382-1b.diff, derby-3382-1b.stat
>
>
> If I copy the database to the slave before booting the master, slave will be out of sync with the master since new log records are created during booting.  The slave will then stop replication, but the master will not be notified.
> If I then try to stop or failover the master the master will hang.


[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Øystein Grøvlen (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Øystein Grøvlen updated DERBY-3382:
-----------------------------------

    Derby Info:   (was: [Patch Available])

Committed patch 1b as revision 630207.

> Replication: Slave must inform master if DBs are out of sync.
> -------------------------------------------------------------
>
>                 Key: DERBY-3382
>                 URL: https://issues.apache.org/jira/browse/DERBY-3382
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 10.4.0.0
>            Reporter: Øystein Grøvlen
>            Assignee: Jørgen Løland
>             Fix For: 10.4.0.0
>
>         Attachments: derby-3382-1a.diff, derby-3382-1a.stat, derby-3382-1b.diff, derby-3382-1b.stat
>
>
> If I copy the database to the slave before booting the master, slave will be out of sync with the master since new log records are created during booting.  The slave will then stop replication, but the master will not be notified.
> If I then try to stop or failover the master the master will hang.


[jira] Issue Comment Edited: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569227#action_12569227 ] 

jorgenlo edited comment on DERBY-3382 at 2/15/08 2:40 AM:
---------------------------------------------------------------

I had a look at the NPE, which is caused by LogToFile calling masterFactory#flushedTo:

LogToFile lines 3966 and on...: 
if (inReplicationMasterMode) {
    masterFactory.flushedTo(
        LogCounter.makeLogInstantAsLong(fileNumber, wherePosition));
}

Seems like LogToFile was never informed that it is no longer in replication mode :-/
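
A toy model of the failure and of one possible remedy, assuming the master factory reference is dropped when replication stops while the mode flag stays set. The class LogFlushSketch and the methods enterMasterMode and leaveMasterMode are invented for the illustration. They are not the methods of LogToFile and this is not the committed fix.

```java
// Hypothetical sketch: a simplified stand-in for the log flushing code.
final class LogFlushSketch {

    /** Stand-in for the component that is told how far the log has been flushed. */
    interface MasterFactory {
        void flushedTo(long instant);
    }

    private boolean inReplicationMasterMode;
    private MasterFactory masterFactory;

    synchronized void enterMasterMode(MasterFactory factory) {
        masterFactory = factory;
        inReplicationMasterMode = true;
    }

    /**
     * Must be called when replication stops. If only the factory were
     * dropped and the flag left set, the next flush would dereference null.
     */
    synchronized void leaveMasterMode() {
        inReplicationMasterMode = false;
        masterFactory = null;
    }

    synchronized void flush(long instant) {
        // ... write the log to disk here ...
        if (inReplicationMasterMode) {
            masterFactory.flushedTo(instant);
        }
    }

    public static void main(String[] args) {
        LogFlushSketch log = new LogFlushSketch();
        log.enterMasterMode(i -> System.out.println("master notified up to " + i));
        log.flush(42L);
        log.leaveMasterMode();
        log.flush(43L); // no notification and no NullPointerException
        System.out.println("flush after stop completed");
    }
}
```

Keeping the flag and the reference under the same lock means a flush never sees one updated without the other.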

      was (Author: jorgenlo):
    I had a look at the hang, which is caused by LogToFile calling masterFac#flushedTo:

lines 3966 and on...: 
if (inReplicationMasterMode) {
    masterFactory.flushedTo(
        LogCounter.makeLogInstantAsLong(fileNumber, wherePosition));
}

Seems like LogToFile was never informed that it is no longer in replication mode :-/
  
> Replication: Slave must inform master if DBs are out of sync.
> -------------------------------------------------------------
>
>                 Key: DERBY-3382
>                 URL: https://issues.apache.org/jira/browse/DERBY-3382
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 10.4.0.0
>            Reporter: Øystein Grøvlen
>            Assignee: Jørgen Løland
>             Fix For: 10.4.0.0
>
>         Attachments: derby-3382-1a.diff, derby-3382-1a.stat
>
>
> If I copy the database to the slave before booting the master, slave will be out of sync with the master since new log records are created during booting.  The slave will then stop replication, but the master will not be notified.
> If I then try to stop or failover the master the master will hang.


[jira] Updated: (DERBY-3382) Replication: Slave must inform master if DBs are out of sync.

Posted by "Jørgen Løland (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jørgen Løland updated DERBY-3382:
---------------------------------

    Derby Info: [Patch Available]

> Replication: Slave must inform master if DBs are out of sync.
> -------------------------------------------------------------
>
>                 Key: DERBY-3382
>                 URL: https://issues.apache.org/jira/browse/DERBY-3382
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 10.4.0.0
>            Reporter: Øystein Grøvlen
>            Assignee: Jørgen Løland
>             Fix For: 10.4.0.0
>
>         Attachments: derby-3382-1a.diff, derby-3382-1a.stat, derby-3382-1b.diff, derby-3382-1b.stat, derby-3382-test-1a.diff, derby-3382-test-1a.stat
>
>
> If I copy the database to the slave before booting the master, slave will be out of sync with the master since new log records are created during booting.  The slave will then stop replication, but the master will not be notified.
> If I then try to stop or failover the master the master will hang.
