Posted to dev@accumulo.apache.org by Sean Busbey <se...@manvsbeard.com> on 2014/04/02 08:06:18 UTC

Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/
-----------------------------------------------------------

(Updated April 2, 2014, 6:06 a.m.)


Review request for accumulo and kturner.


Changes
-------

Updated implementation to make sure that Fate isn't started until after we finish upgrading.


Bugs: ACCUMULO-2519
    https://issues.apache.org/jira/browse/ACCUMULO-2519


Repository: accumulo


Description
-------

Adds "make sure Fate has no outstanding items" to the upgrade instructions. Makes sure the master and tablet servers don't take upgrade steps if they see Fate operations waiting.


Diffs (updated)
-----

  README 115a9b7 
  server/src/main/java/org/apache/accumulo/server/Accumulo.java 99ec7e4 
  server/src/main/java/org/apache/accumulo/server/master/Master.java 8c4c864 
  server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java d76946d 
  server/src/main/java/org/apache/accumulo/server/util/MetadataTable.java 7328a55 

Diff: https://reviews.apache.org/r/19804/diff/


Testing (updated)
-------

Took a 1.4.5-SNAP cluster

* loaded test data in a variety of table configs
* alternated table creation and deletion
* loaded an additional table to cause !METADATA churn
* shut down the cluster uncleanly
* verified waiting Fate transactions (table deletion at success status)
* verified waiting local WALs
* verified waiting local WALs include !METADATA table (via LogReader)
* verified /accumulo/version showed 4
* started upgrade to 1.5.2-SNAP
* verified errors saying that no upgrade happened and pointing back to the docs in: monitor, master logs, tabletserver logs
* verified same waiting Fate transactions
* verified same waiting local WALs
* verified /accumulo/version showed 4
* cleared Fate operations
* started upgrade to 1.5.2-SNAP
* waited a terrifyingly long amount of time, checking on progress via local logs
* verified no errors shown for upgrade
* verified WALs copied to HDFS
* verified /accumulo/version showed 5
* verified monitor showed normal start up
* waited for all tablets to be hosted
* verified test data


Thanks,

Sean Busbey


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by Sean Busbey <se...@manvsbeard.com>.

> On April 2, 2014, 7:07 a.m., Mike Drob wrote:
> > server/src/main/java/org/apache/accumulo/server/master/Master.java, line 321
> > <https://reviews.apache.org/r/19804/diff/2/?file=544967#file544967line321>
> >
> >     If there is risk that multiple threads will enter this block, then the second thread will trigger the countdown latch in the else. If there is not (which I don't think there is, because this only gets called from inside of a synchronized method) then why do we perform this check?

It gets called from the Status thread, and we might transition into the state multiple times. The atomic boolean ensures we only ever enter the block once, regardless of how we got there.

That means the countdown latch is only triggered in the else branch of the "is an upgrade needed" check, not in an else branch of the atomic boolean check.
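
A minimal sketch of the guard described above, not the actual Master code. The class, field, and method names are hypothetical; only the use of an atomic boolean and a countdown latch comes from the discussion.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the guarded block; names are illustrative only. */
final class UpgradeGuardSketch {
  private final AtomicBoolean upgradeChecked = new AtomicBoolean(false);
  private final CountDownLatch waitForUpgrade = new CountDownLatch(1);
  private final AtomicInteger upgradesStarted = new AtomicInteger();

  /** May be called on every state transition; only the first call does any work. */
  void onStateTransition(boolean upgradeNeeded) {
    // compareAndSet succeeds for exactly one caller, so the block runs once.
    if (upgradeChecked.compareAndSet(false, true)) {
      if (upgradeNeeded) {
        // The real code would start the upgrade thread here.
        upgradesStarted.incrementAndGet();
      } else {
        // Inner else: nothing to upgrade, so release any waiters right away.
        waitForUpgrade.countDown();
      }
    }
    // No else on the outer check: repeat calls never touch the latch.
  }

  long latchCount() {
    return waitForUpgrade.getCount();
  }

  int upgradesStarted() {
    return upgradesStarted.get();
  }
}
```

In this sketch a second call to `onStateTransition` falls straight through, which is the behaviour the reply relies on.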


> On April 2, 2014, 7:07 a.m., Mike Drob wrote:
> > server/src/main/java/org/apache/accumulo/server/master/Master.java, line 352
> > <https://reviews.apache.org/r/19804/diff/2/?file=544967#file544967line352>
> >
> >     Instead of adding a countdown latch, could we not have just waited for this thread to complete? That seems more straightforward.

No, because the thread never exists if we don't need to do an upgrade.
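
To illustrate the point, here is a rough sketch under assumed names (none of them taken from the patch). When no upgrade is needed there is no thread object to join, but a latch can still be released directly so that waiters proceed either way.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of waiting on a latch instead of Thread.join(). */
final class LatchVersusJoinSketch {
  private final CountDownLatch upgradeDone = new CountDownLatch(1);
  private Thread upgradeThread; // stays null when no upgrade is needed

  void maybeUpgrade(boolean upgradeNeeded) {
    if (upgradeNeeded) {
      upgradeThread = new Thread(() -> {
        // ... upgrade work would go here ...
        upgradeDone.countDown();
      });
      upgradeThread.start();
    } else {
      // No thread to join, but waiters must still be released.
      upgradeDone.countDown();
    }
  }

  /** Returns true once the latch is released, false on timeout or interrupt. */
  boolean awaitUpgrade(long timeoutMillis) {
    try {
      return upgradeDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  boolean hasUpgradeThread() {
    return upgradeThread != null;
  }
}
```

Callers only ever wait on the latch, so they do not need to know whether an upgrade thread was created.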


- Sean


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/#review39241
-----------------------------------------------------------


On April 2, 2014, 6:06 a.m., Sean Busbey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19804/
> -----------------------------------------------------------
> 
> (Updated April 2, 2014, 6:06 a.m.)
> 
> 


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by Mike Drob <md...@mdrob.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/#review39241
-----------------------------------------------------------



server/src/main/java/org/apache/accumulo/server/Accumulo.java
<https://reviews.apache.org/r/19804/#comment71600>

    Make a note here, or possibly elsewhere, that completed operations in the "SUCCESS" status will still cause an upgrade to fail.
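
    A small sketch of the distinction this note asks to document, using a hypothetical in-memory model of Fate records rather than the real Fate API. The enum values and method names are assumptions made for illustration.

```java
import java.util.Map;

/** Hypothetical model of leftover Fate records; names are illustrative only. */
final class FateLeftoverCheckSketch {
  enum TxStatus { NEW, IN_PROGRESS, FAILED, SUCCESS }

  /** Looser check: only transactions that have not finished successfully. */
  static boolean hasUnfinished(Map<Long, TxStatus> transactions) {
    for (TxStatus status : transactions.values()) {
      if (status != TxStatus.SUCCESS) {
        return true;
      }
    }
    return false;
  }

  /** Stricter check matching the note: any remaining record blocks the upgrade. */
  static boolean blocksUpgrade(Map<Long, TxStatus> transactions) {
    return !transactions.isEmpty();
  }
}
```

    With a single record in `SUCCESS` status, `hasUnfinished` returns false while `blocksUpgrade` returns true, which is the case the note wants called out.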



server/src/main/java/org/apache/accumulo/server/master/Master.java
<https://reviews.apache.org/r/19804/#comment71603>

    If there is risk that multiple threads will enter this block, then the second thread will trigger the countdown latch in the else. If there is not (which I don't think there is, because this only gets called from inside of a synchronized method) then why do we perform this check?



server/src/main/java/org/apache/accumulo/server/master/Master.java
<https://reviews.apache.org/r/19804/#comment71602>

    Instead of adding a countdown latch, could we not have just waited for this thread to complete? That seems more straightforward.


- Mike Drob


On April 2, 2014, 6:06 a.m., Sean Busbey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19804/
> -----------------------------------------------------------
> 
> (Updated April 2, 2014, 6:06 a.m.)
> 
> 


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by Sean Busbey <se...@manvsbeard.com>.

> On April 3, 2014, 8:08 p.m., kturner wrote:
> > server/src/main/java/org/apache/accumulo/server/master/Master.java, line 276
> > <https://reviews.apache.org/r/19804/diff/4/?file=545884#file545884line276>
> >
> >     This should be volatile because upgradeZookeeper() and upgradeMetadata() will be run by separate threads.

In reading the code, I thought upgradeZooKeeper had to happen prior to the thread that calls upgradeMetadata being created. Am I reading the code wrong?

Master.run() calls getMasterLock (which is synchronous) and then several lines later creates the status thread and starts it.


- Sean


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/#review39468
-----------------------------------------------------------


On April 2, 2014, 3:10 p.m., Sean Busbey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19804/
> -----------------------------------------------------------
> 
> (Updated April 2, 2014, 3:10 p.m.)
> 
> 


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by ke...@deenlo.com.

> On April 3, 2014, 8:08 p.m., kturner wrote:
> > server/src/main/java/org/apache/accumulo/server/master/Master.java, line 276
> > <https://reviews.apache.org/r/19804/diff/4/?file=545884#file545884line276>
> >
> >     This should be volatile because upgradeZookeeper() and upgradeMetadata() will be run by separate threads.
> 
> Sean Busbey wrote:
>     In reading the code, I thought upgradeZooKeeper had to happen prior to the thread that calls upgradeMetadata being created. Am I reading the code wrong?
>     
>     Master.run() calls getMasterLock (which is synchronous) and then several lines later creates the status thread and starts it.

OK. I did not look at the calling methods. Actually, setMasterState() is what's synchronized, and that method calls both upgradeZooKeeper() and upgradeMetadata(). Because both calls happen while holding the same lock, each thread will see any changes other threads make to the boolean, so it does not need to be volatile.

I also think upgradeZooKeeper will be called before upgradeMetadata.
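
A minimal sketch of that reasoning, with hypothetical names that are not taken from Master.java. The flag is a plain field, but every read and write happens inside a synchronized method on the same object, and the Java memory model guarantees that a write made before releasing a monitor is visible to the next thread that acquires it.

```java
/** Hypothetical illustration of visibility through a shared lock. */
final class SynchronizedVisibilitySketch {
  enum State { ZOOKEEPER_STEP, METADATA_STEP }

  private boolean zooKeeperUpgraded = false; // plain field, deliberately not volatile
  private boolean metadataUpgraded = false;

  // Both steps run while holding this object's monitor, so the write in the
  // first step happens-before the read in the second, even across threads.
  synchronized void setState(State state) {
    switch (state) {
      case ZOOKEEPER_STEP:
        zooKeeperUpgraded = true;
        break;
      case METADATA_STEP:
        if (!zooKeeperUpgraded) {
          throw new IllegalStateException("ZooKeeper step must run first");
        }
        metadataUpgraded = true;
        break;
      default:
        throw new IllegalArgumentException("unknown state: " + state);
    }
  }

  synchronized boolean isMetadataUpgraded() {
    return metadataUpgraded;
  }
}
```

If the flag were ever read outside a synchronized block, the argument would no longer hold and volatile (or another form of synchronization) would be needed.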


- kturner


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/#review39468
-----------------------------------------------------------


On April 2, 2014, 3:10 p.m., Sean Busbey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19804/
> -----------------------------------------------------------
> 
> (Updated April 2, 2014, 3:10 p.m.)
> 
> 


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by ke...@deenlo.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/#review39468
-----------------------------------------------------------



server/src/main/java/org/apache/accumulo/server/master/Master.java
<https://reviews.apache.org/r/19804/#comment71859>

    This should be volatile because upgradeZookeeper() and upgradeMetadata() will be run by separate threads.


- kturner


On April 2, 2014, 3:10 p.m., Sean Busbey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19804/
> -----------------------------------------------------------
> 
> (Updated April 2, 2014, 3:10 p.m.)
> 
> 


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by Josh Elser <jo...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/#review39607
-----------------------------------------------------------

Ship It!

- Josh Elser


On April 4, 2014, 10:28 p.m., Sean Busbey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19804/
> -----------------------------------------------------------
> 
> (Updated April 4, 2014, 10:28 p.m.)
> 
> 


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by Sean Busbey <se...@manvsbeard.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/
-----------------------------------------------------------

(Updated April 4, 2014, 10:28 p.m.)


Review request for accumulo and kturner.


Bugs: ACCUMULO-2519
    https://issues.apache.org/jira/browse/ACCUMULO-2519


Repository: accumulo


Description
-------

Adds "make sure Fate has no outstanding items" to the upgrade instructions. Makes sure the master and tablet servers don't take upgrade steps if they see Fate operations waiting.


Diffs
-----

  README 115a9b7 
  server/src/main/java/org/apache/accumulo/server/Accumulo.java 99ec7e4 
  server/src/main/java/org/apache/accumulo/server/master/Master.java 8c4c864 
  server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java d76946d 
  server/src/main/java/org/apache/accumulo/server/util/MetadataTable.java 7328a55 

Diff: https://reviews.apache.org/r/19804/diff/


Testing (updated)
-------

Took a 1.4.5-SNAP cluster

* loaded test data in a variety of table configs
* alternated table creation and deletion
* loaded an additional table to cause !METADATA churn
* shut down the cluster uncleanly
* verified waiting Fate transactions (table deletion at success status)
* verified waiting local WALs
* verified waiting local WALs include !METADATA table (via LogReader)
* verified /accumulo/version showed 4
* started upgrade to 1.5.2-SNAP
* verified errors saying that no upgrade happened and pointing back to the docs in: monitor, master logs, tabletserver logs
* verified same waiting Fate transactions
* verified same waiting local WALs
* verified /accumulo/version showed 4
* cleared Fate operations
* started upgrade to 1.5.2-SNAP
* waited a terrifyingly long amount of time, checking on progress via local logs
* verified no errors shown for upgrade
* verified WALs copied to HDFS
* verified /accumulo/version showed 5
* verified monitor showed normal start up
* waited for all tablets to be hosted
* verified test data

After merging forward to 1.6.0-SNAPSHOT branch:

Took 1.5.2-SNAP cluster from above test

* loaded additional test data in same variety of table configs
* queued compactions from shell
* loaded an additional table to cause !METADATA churn
* shut down the cluster uncleanly
* verified waiting Fate transactions (compactions)
* verified WALs in HDFS
* verified WALs include !METADATA table (via LogReader)
* verified /accumulo/version showed 5
* started upgrade to 1.6.0-SNAP
* verified errors saying that no upgrade happened and pointing back to the docs in: monitor, master logs
* verified same waiting Fate transactions
* verified same waiting HDFS WALs
* verified /accumulo/version showed 5
* cleared Fate operations
* started upgrade to 1.6.0-SNAP
* waited a good deal of time, though not as long as last time (largely for recovery of ~7GB of WALs)
* verified no errors shown for upgrade
* verified /accumulo/version showed 6
* verified restarting the cluster post-upgrade doesn't trigger another upgrade
* verified monitor showed normal start up
* waited for all tablets to be hosted
* verified test data (both 1.4 written and 1.5 written)


Thanks,

Sean Busbey


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by ke...@deenlo.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/#review39476
-----------------------------------------------------------

Ship It!

- kturner


On April 2, 2014, 3:10 p.m., Sean Busbey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19804/
> -----------------------------------------------------------
> 
> (Updated April 2, 2014, 3:10 p.m.)
> 
> 


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by Sean Busbey <se...@manvsbeard.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/
-----------------------------------------------------------

(Updated April 2, 2014, 3:10 p.m.)


Review request for accumulo and kturner.


Changes
-------

Updated sanity checks and docs per Bill H's feedback.


Bugs: ACCUMULO-2519
    https://issues.apache.org/jira/browse/ACCUMULO-2519


Repository: accumulo


Description
-------

Adds "make sure Fate has no outstanding items" to the upgrade instructions. Makes sure the master and tablet servers don't take upgrade steps if they see Fate operations waiting.


Diffs (updated)
-----

  README 115a9b7 
  server/src/main/java/org/apache/accumulo/server/Accumulo.java 99ec7e4 
  server/src/main/java/org/apache/accumulo/server/master/Master.java 8c4c864 
  server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java d76946d 
  server/src/main/java/org/apache/accumulo/server/util/MetadataTable.java 7328a55 

Diff: https://reviews.apache.org/r/19804/diff/


Testing
-------

Took a 1.4.5-SNAP cluster

* loaded test data in a variety of table configs
* alternated table creation and deletion
* loaded an additional table to cause !METADATA churn
* shut down the cluster uncleanly
* verified waiting Fate transactions (table deletion at success status)
* verified waiting local WALs
* verified waiting local WALs include !METADATA table (via LogReader)
* verified /accumulo/version showed 4
* started upgrade to 1.5.2-SNAP
* verified errors saying that no upgrade happened and pointing back to the docs in: monitor, master logs, tabletserver logs
* verified same waiting Fate transactions
* verified same waiting local WALs
* verified /accumulo/version showed 4
* cleared Fate operations
* started upgrade to 1.5.2-SNAP
* waited a terrifyingly long amount of time, checking on progress via local logs
* verified no errors shown for upgrade
* verified WALs copied to HDFS
* verified /accumulo/version showed 5
* verified monitor showed normal start up
* waited for all tablets to be hosted
* verified test data


Thanks,

Sean Busbey


Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by Sean Busbey <se...@manvsbeard.com>.

> On April 2, 2014, 2 p.m., Bill Havanki wrote:
> > README, line 61
> > <https://reviews.apache.org/r/19804/diff/3/?file=545050#file545050line61>
> >
> >     nit: "to delete"

fixed


> On April 2, 2014, 2 p.m., Bill Havanki wrote:
> > server/src/main/java/org/apache/accumulo/server/master/Master.java, line 288
> > <https://reviews.apache.org/r/19804/diff/3/?file=545052#file545052line288>
> >
> >     IllegalStateException would be even better to throw here (and other spots later on).

good idea!
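
Roughly the shape I have in mind, as a sketch only. FateGuardSketch and abortIfFateTransactions are hypothetical names for illustration, not what is in Master.java, and the message text is made up. The idea is just that IllegalStateException says "the system is in the wrong state for this operation" more precisely than a bare RuntimeException would.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical guard; the real check in the patch may look quite different.
class FateGuardSketch {

  // Throws IllegalStateException if any Fate transaction ids are still outstanding.
  static void abortIfFateTransactions(List<Long> outstandingTxIds) {
    if (!outstandingTxIds.isEmpty()) {
      throw new IllegalStateException("Aborting upgrade: found " + outstandingTxIds.size()
          + " outstanding Fate transaction(s) from a prior version."
          + " Please see the upgrade instructions in the README.");
    }
  }

  public static void main(String[] args) {
    // No outstanding transactions: the guard returns quietly.
    abortIfFateTransactions(Collections.<Long>emptyList());
    try {
      abortIfFateTransactions(Arrays.asList(7L, 8L));
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```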


- Sean


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/#review39251
-----------------------------------------------------------




Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by Bill Havanki <bh...@clouderagovt.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/#review39251
-----------------------------------------------------------



README
<https://reviews.apache.org/r/19804/#comment71613>

    nit: "to delete"



server/src/main/java/org/apache/accumulo/server/master/Master.java
<https://reviews.apache.org/r/19804/#comment71614>

    IllegalStateException would be even better to throw here (and other spots later on).


- Bill Havanki




Re: Review Request 19804: ACCUMULO-2519 Aborts upgrade if there are Fate transactions from an old version.

Posted by Sean Busbey <se...@manvsbeard.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19804/
-----------------------------------------------------------

(Updated April 2, 2014, 7:50 a.m.)


Review request for accumulo and kturner.


Changes
-------

Updated docs to call out that completed Fate operations will block an upgrade and can be deleted.


Bugs: ACCUMULO-2519
    https://issues.apache.org/jira/browse/ACCUMULO-2519


Repository: accumulo


Description
-------

Adds "make sure Fate has no outstanding items" to the upgrade instructions. Makes sure the master and tabletservers don't take upgrade steps if they see fate ops waiting.


Diffs (updated)
-----

  README 115a9b7 
  server/src/main/java/org/apache/accumulo/server/Accumulo.java 99ec7e4 
  server/src/main/java/org/apache/accumulo/server/master/Master.java 8c4c864 
  server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java d76946d 
  server/src/main/java/org/apache/accumulo/server/util/MetadataTable.java 7328a55 

Diff: https://reviews.apache.org/r/19804/diff/


Testing
-------

Took a 1.4.5-SNAP cluster

* loaded test data in a variety of table configs
* alternated table creation and deletion
* loaded an additional table to cause !METADATA churn
* shut down cluster uncleanly
* verified waiting Fate transactions (table deletion at success status)
* verified waiting local WALs
* verified waiting local WALs include !METADATA table (via LogReader)
* verified /accumulo/version showed 4
* Start upgrade to 1.5.2-SNAP
* verified errors reporting that no upgrade happened and pointing back to the docs in: monitor, master logs, tabletserver logs
* verified same waiting Fate transactions
* verified same waiting local WALs
* verified /accumulo/version showed 4
* Cleared Fate operations
* Start upgrade to 1.5.2-SNAP
* waited a terrifyingly long amount of time, checking on progress via local logs
* verify no errors shown for upgrade
* verified WALs copied to HDFS
* verified /accumulo/version showed 5
* verified monitor showed normal start up
* wait for all tablets to be hosted
* verify test data


Thanks,

Sean Busbey