You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@aurora.apache.org by David McLaughlin <da...@dmclaughlin.com> on 2016/12/22 08:49:39 UTC

Review Request 54883: Move snapshots into a separate log

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/
-----------------------------------------------------------

Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.


Repository: aurora


Description
-------

See here for more details: https://docs.google.com/a/twitter.com/document/d/1BzJsx7rGAzJvIZnI-DSuVGD524tc7M2vPrH0HUORlfI/edit?usp=sharing

The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 

As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.


Diffs
-----

  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
  examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
  examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
  src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
  src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
  src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
  src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
  src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
  src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
  src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
  src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
  src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
  src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
  src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 

Diff: https://reviews.apache.org/r/54883/diff/


Testing
-------

I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.


Example log output on my vagrant image:

I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0


Thanks,

David McLaughlin


Re: Review Request 54883: Move snapshots into a separate log

Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review159937
-----------------------------------------------------------



Master (38b9311) is red with this patch.
  ./build-support/jenkins/build.sh


:processJmhResources UP-TO-DATE
:jmhClasses
:checkstyleJmh
:jsHint
:checkstyleMain
:compileTestJava/home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/thrift/aop/MockDecoratedThrift.java:38: Note: Wrote forwarder org.apache.aurora.scheduler.thrift.aop.MockDecoratedThriftForwarder
@Forward(AnnotatedAuroraAdmin.class)
^
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

:processTestResources
:testClasses
:checkstyleTest[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java:16:8: Unused import - java.lang.reflect.Constructor. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java:240:23: 'typecast' is not followed by whitespace. [WhitespaceAfter]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java:59:8: Unused import - org.apache.aurora.scheduler.log.Log. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java:393: Line is longer than 100 characters (found 113). [LineLength]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java:26:8: Unused import - com.google.common.base.Throwables. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java:116:8: Unused import - org.easymock.internal.MocksControl. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java:125:15: Unused import - org.easymock.EasyMock.anyObject. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java:131:15: Unused import - org.easymock.EasyMock.mock. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java:272:66: '{' is followed by whitespace. [NoWhitespaceAfter]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java:445:39: '{' is followed by whitespace. [NoWhitespaceAfter]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java:835: Line is longer than 100 characters (found 101). [LineLength]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java:1039: Line is longer than 100 characters (found 102). [LineLength]
 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':checkstyleTest'.
> Checkstyle rule violations were found. See the report at: file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/checkstyle/test.html

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

Total time: 1 mins 22.703 secs


I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot


On Dec. 22, 2016, 9:05 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 9:05 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/a/dmclaughlin.com/document/d/1OE7IVAAQ5vZlYQMWcNCmpyTHKHQZCxw_nqjLIlJe3Vo/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review163423
-----------------------------------------------------------



Current status of this patch: we're working on having reliable scale testing suite at Twitter. When that is in place, we'll try this patch out and see what kind of impact there is.

- David McLaughlin


On Dec. 22, 2016, 9:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 9:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by Stephan Erb <se...@apache.org>.

> On Dec. 27, 2016, 11:50 p.m., Joshua Cohen wrote:
> > Overall this looks good to me. I think some work would be required to productionize it (e.g. make it optional to start). Obviously want to vet this in a test cluster and we would need some doc changes to go with this and RELEASE-NOTES.md, etc. but that's easy enough to take care of.
> > 
> > I also want to be sure I understand the behavior in the event of a problem with one, or both replicated logs. So... scenarios:
> > 
> > 1) One log or the other is missing/corrupt: our state becomes the sum of whatever's found in the log that's present/valid.
> > 2) There's no chance of a snapshot being persisted that references a log position that hasn't been persisted to the operation log, right? E.g. there's no realistic way that appending the noop transaction could take longer than creating/persisting the snapshot (in which case the snapshot could be successfully persisted referencing that position, yet that position would not exist in the operation log).
> 
> David McLaughlin wrote:
>     1) It's exactly the same as today if the log is corrupt - you need to restore from a backup. 
>     2) No, that is not possible. Log appends are synchronous. 
>     
>     I'm -1 to making this optional, purely because that doesn't reduce the risk in any way - you'd just be putting off adding flags to your start-up script. But yeah, it means I'll have to vet this change in a production-like environment before this gets committed.
> 
> Joshua Cohen wrote:
>     I'd like to hear what others running Aurora in production think about putting this behind a flag. This feels like a big change to force on people and I think if this came from someone outside of our org we'd like the flexibility to enable this on our own timeframe rather than being blocked from picking up all updates until we were comfortable/ready.
> 
> Stephan Erb wrote:
>     Most people (us included) only deploy major versions. If we ship this in 0.17 there will be many unrelated changes that might cause trouble and require a rollback. So, even if this change is OK, people could be bitten by other changes in the same release.
>     
>     I therefore agree with Joshua here that we need a way to make this tansition smoother.
> 
> David McLaughlin wrote:
>     A feature flag still doesn't solve that problem. You're just punting the migration to another release that will also have commits that might cause a rollback.

I still believe that a feature toggle would be helpful. I don't fear that this particular patch is broken, but rather that any of the hundreds of other changes within the release may come with a regression. Decoupling the backwards compatible from the incompatible parts just makes sure I can test the former before having to deal with the latter.  

An alternative solution would be to put this patch into a release of its own.


- Stephan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160188
-----------------------------------------------------------


On Dec. 22, 2016, 10:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 10:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by Joshua Cohen <jc...@apache.org>.

> On Dec. 27, 2016, 10:50 p.m., Joshua Cohen wrote:
> > Overall this looks good to me. I think some work would be required to productionize it (e.g. make it optional to start). Obviously want to vet this in a test cluster and we would need some doc changes to go with this and RELEASE-NOTES.md, etc. but that's easy enough to take care of.
> > 
> > I also want to be sure I understand the behavior in the event of a problem with one, or both replicated logs. So... scenarios:
> > 
> > 1) One log or the other is missing/corrupt: our state becomes the sum of whatever's found in the log that's present/valid.
> > 2) There's no chance of a snapshot being persisted that references a log position that hasn't been persisted to the operation log, right? E.g. there's no realistic way that appending the noop transaction could take longer than creating/persisting the snapshot (in which case the snapshot could be successfully persisted referencing that position, yet that position would not exist in the operation log).
> 
> David McLaughlin wrote:
>     1) It's exactly the same as today if the log is corrupt - you need to restore from a backup. 
>     2) No, that is not possible. Log appends are synchronous. 
>     
>     I'm -1 to making this optional, purely because that doesn't reduce the risk in any way - you'd just be putting off adding flags to your start-up script. But yeah, it means I'll have to vet this change in a production-like environment before this gets committed.

I'd like to hear what others running Aurora in production think about putting this behind a flag. This feels like a big change to force on people and I think if this came from someone outside of our org we'd like the flexibility to enable this on our own timeframe rather than being blocked from picking up all updates until we were comfortable/ready.


- Joshua


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160188
-----------------------------------------------------------


On Dec. 22, 2016, 9:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 9:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by David McLaughlin <da...@dmclaughlin.com>.

> On Dec. 27, 2016, 10:50 p.m., Joshua Cohen wrote:
> > Overall this looks good to me. I think some work would be required to productionize it (e.g. make it optional to start). Obviously want to vet this in a test cluster and we would need some doc changes to go with this and RELEASE-NOTES.md, etc. but that's easy enough to take care of.
> > 
> > I also want to be sure I understand the behavior in the event of a problem with one, or both replicated logs. So... scenarios:
> > 
> > 1) One log or the other is missing/corrupt: our state becomes the sum of whatever's found in the log that's present/valid.
> > 2) There's no chance of a snapshot being persisted that references a log position that hasn't been persisted to the operation log, right? E.g. there's no realistic way that appending the noop transaction could take longer than creating/persisting the snapshot (in which case the snapshot could be successfully persisted referencing that position, yet that position would not exist in the operation log).

1) It's exactly the same as today if the log is corrupt - you need to restore from a backup. 
2) No, that is not possible. Log appends are synchronous. 

I'm -1 to making this optional, purely because that doesn't reduce the risk in any way - you'd just be putting off adding flags to your start-up script. But yeah, it means I'll have to vet this change in a production-like environment before this gets committed.


> On Dec. 27, 2016, 10:50 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java, line 542
> > <https://reviews.apache.org/r/54883/diff/2/?file=1591095#file1591095line542>
> >
> >     This is probably a naive question, but why do we need to noop transaction? Can we not just `truncateBefore(snapshotPosition - 1)`? Are the log positions just not incremented in this fashion?

No, they are opaque references. Position is an empty interface, not a long. See the reply to Stephan above.


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160188
-----------------------------------------------------------


On Dec. 22, 2016, 9:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 9:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by David McLaughlin <da...@dmclaughlin.com>.

> On Dec. 27, 2016, 10:50 p.m., Joshua Cohen wrote:
> > Overall this looks good to me. I think some work would be required to productionize it (e.g. make it optional to start). Obviously want to vet this in a test cluster and we would need some doc changes to go with this and RELEASE-NOTES.md, etc. but that's easy enough to take care of.
> > 
> > I also want to be sure I understand the behavior in the event of a problem with one, or both replicated logs. So... scenarios:
> > 
> > 1) One log or the other is missing/corrupt: our state becomes the sum of whatever's found in the log that's present/valid.
> > 2) There's no chance of a snapshot being persisted that references a log position that hasn't been persisted to the operation log, right? E.g. there's no realistic way that appending the noop transaction could take longer than creating/persisting the snapshot (in which case the snapshot could be successfully persisted referencing that position, yet that position would not exist in the operation log).
> 
> David McLaughlin wrote:
>     1) It's exactly the same as today if the log is corrupt - you need to restore from a backup. 
>     2) No, that is not possible. Log appends are synchronous. 
>     
>     I'm -1 to making this optional, purely because that doesn't reduce the risk in any way - you'd just be putting off adding flags to your start-up script. But yeah, it means I'll have to vet this change in a production-like environment before this gets committed.
> 
> Joshua Cohen wrote:
>     I'd like to hear what others running Aurora in production think about putting this behind a flag. This feels like a big change to force on people and I think if this came from someone outside of our org we'd like the flexibility to enable this on our own timeframe rather than being blocked from picking up all updates until we were comfortable/ready.
> 
> Stephan Erb wrote:
>     Most people (us included) only deploy major versions. If we ship this in 0.17 there will be many unrelated changes that might cause trouble and require a rollback. So, even if this change is OK, people could be bitten by other changes in the same release.
>     
>     I therefore agree with Joshua here that we need a way to make this tansition smoother.

A feature flag still doesn't solve that problem. You're just punting the migration to another release that will also have commits that might cause a rollback.


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160188
-----------------------------------------------------------


On Dec. 22, 2016, 9:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 9:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by Stephan Erb <se...@apache.org>.

> On Dec. 27, 2016, 11:50 p.m., Joshua Cohen wrote:
> > Overall this looks good to me. I think some work would be required to productionize it (e.g. make it optional to start). Obviously want to vet this in a test cluster and we would need some doc changes to go with this and RELEASE-NOTES.md, etc. but that's easy enough to take care of.
> > 
> > I also want to be sure I understand the behavior in the event of a problem with one, or both replicated logs. So... scenarios:
> > 
> > 1) One log or the other is missing/corrupt: our state becomes the sum of whatever's found in the log that's present/valid.
> > 2) There's no chance of a snapshot being persisted that references a log position that hasn't been persisted to the operation log, right? E.g. there's no realistic way that appending the noop transaction could take longer than creating/persisting the snapshot (in which case the snapshot could be successfully persisted referencing that position, yet that position would not exist in the operation log).
> 
> David McLaughlin wrote:
>     1) It's exactly the same as today if the log is corrupt - you need to restore from a backup. 
>     2) No, that is not possible. Log appends are synchronous. 
>     
>     I'm -1 to making this optional, purely because that doesn't reduce the risk in any way - you'd just be putting off adding flags to your start-up script. But yeah, it means I'll have to vet this change in a production-like environment before this gets committed.
> 
> Joshua Cohen wrote:
>     I'd like to hear what others running Aurora in production think about putting this behind a flag. This feels like a big change to force on people and I think if this came from someone outside of our org we'd like the flexibility to enable this on our own timeframe rather than being blocked from picking up all updates until we were comfortable/ready.

Most people (us included) only deploy major versions. If we ship this in 0.17 there will be many unrelated changes that might cause trouble and require a rollback. So, even if this change is OK, people could be bitten by other changes in the same release.

I therefore agree with Joshua here that we need a way to make this tansition smoother.


- Stephan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160188
-----------------------------------------------------------


On Dec. 22, 2016, 10:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 10:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by Joshua Cohen <jc...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160188
-----------------------------------------------------------



Overall this looks good to me. I think some work would be required to productionize it (e.g. make it optional to start). Obviously want to vet this in a test cluster and we would need some doc changes to go with this and RELEASE-NOTES.md, etc. but that's easy enough to take care of.

I also want to be sure I understand the behavior in the event of a problem with one, or both replicated logs. So... scenarios:

1) One log or the other is missing/corrupt: our state becomes the sum of whatever's found in the log that's present/valid.
2) There's no chance of a snapshot being persisted that references a log position that hasn't been persisted to the operation log, right? E.g. there's no realistic way that appending the noop transaction could take longer than creating/persisting the snapshot (in which case the snapshot could be successfully persisted referencing that position, yet that position would not exist in the operation log).


src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java (lines 33 - 34)
<https://reviews.apache.org/r/54883/#comment231274>

    Is there any benefit in having `MesosLogStreamModule` create the directory if an operator has to run `mesos-log --initialize` out of band? Should we just add an `@Exists` annotation to the args and drop the `mkdirs` call in `MesosLogStreamModule`?



src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java (line 38)
<https://reviews.apache.org/r/54883/#comment231273>

    fix this to reference snapshot, rather than native log data.



src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java (line 107)
<https://reviews.apache.org/r/54883/#comment231263>

    nit, `Class` isn't reserved as part of a longer variable name, so just `logTypeClass`? Or better yet, `Class` is implied by the type, so just `logType`?



src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java (line 533)
<https://reviews.apache.org/r/54883/#comment231268>

    Use `{}` for log replacements instead of string concat (I thought we had checkstyle set up to alert on these now?).



src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java (line 542)
<https://reviews.apache.org/r/54883/#comment231269>

    This is probably a naive question, but why do we need to noop transaction? Can we not just `truncateBefore(snapshotPosition - 1)`? Are the log positions just not incremented in this fashion?



src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java (line 87)
<https://reviews.apache.org/r/54883/#comment231270>

    nit: blank line between description and `@param`/`@return`



src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java (lines 92 - 94)
<https://reviews.apache.org/r/54883/#comment231271>

    same here.



src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java (line 106)
<https://reviews.apache.org/r/54883/#comment231272>

    +blank line before


- Joshua Cohen


On Dec. 22, 2016, 9:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 9:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by David McLaughlin <da...@dmclaughlin.com>.

> On Dec. 27, 2016, 4:19 p.m., Stephan Erb wrote:
> > api/src/main/thrift/org/apache/aurora/gen/storage.thrift, lines 156-157
> > <https://reviews.apache.org/r/54883/diff/2/?file=1591086#file1591086line156>
> >
> >     Is it somehow necessary to use a serialized position here? I have the gut feeling that a plain long would probably be better as it reduces coupling to replicated log interna.

Actually, this way reduces coupling to the Mesos log. After all, it's an implementation detail of the Mesos replicated log that it uses longs for ordering. In other implementations of distributed logs you might need a tuple as the id - e.g. with BookKeeper, you have a ledger id and sequence id.

Whoever wrote the Mesos code (identity is the only public method to get at the underlying value) and our log code (Position is an empty interface) did a good job with the abstractions there.


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160165
-----------------------------------------------------------


On Dec. 22, 2016, 9:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 9:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by Stephan Erb <se...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160165
-----------------------------------------------------------



I like it! I only have a couple of minor remarks below.


api/src/main/thrift/org/apache/aurora/gen/storage.thrift (lines 156 - 157)
<https://reviews.apache.org/r/54883/#comment231219>

    Is it somehow necessary to use a serialized position here? I have the gut feeling that a plain long would probably be better as it reduces coupling to replicated log interna.



examples/vagrant/aurorabuild.sh (line 68)
<https://reviews.apache.org/r/54883/#comment231220>

    Our documentation will need to be updated for this additional step as well. (Of course, only once everything else has been settled :)



src/main/java/org/apache/aurora/scheduler/log/Log.java (lines 164 - 169)
<https://reviews.apache.org/r/54883/#comment231221>

    This factory should probably reside on the `Position` class instead.



src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java (lines 37 - 40)
<https://reviews.apache.org/r/54883/#comment231222>

    I believe those new options should still be prefied with `native`. Other options like quorum and timeouts apply to this log as well. Using a commong prefix for all of them makes it easier to find the related options in the out put of `aurora-scheduler -help`.



src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java (lines 46 - 48)
<https://reviews.apache.org/r/54883/#comment231223>

    Same as for `snapshot_log_file_path`.



src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java (line 329)
<https://reviews.apache.org/r/54883/#comment231226>

    We probably want to track this for both logs independently.



src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java (line 233)
<https://reviews.apache.org/r/54883/#comment231224>

    In other places of the code base we use  `AsyncUtil.singleThreadLoggingScheduledExecutor(...)` instead.



src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java (lines 99 - 102)
<https://reviews.apache.org/r/54883/#comment231228>

    Could you please elaborate why we need this special case here? Do `setFieldValue` and `getFieldValue` have different types?



src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java (lines 64 - 75)
<https://reviews.apache.org/r/54883/#comment231225>

    For debugging it will be helpful to differentiate those metrics for operations and snapshot log. As it stands the metric will be shared across both instances.


- Stephan Erb


On Dec. 22, 2016, 10:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 10:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by David McLaughlin <da...@dmclaughlin.com>.

> On Dec. 27, 2016, 7:42 p.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java, line 529
> > <https://reviews.apache.org/r/54883/diff/2/?file=1591095#file1591095line529>
> >
> >     I'm not following this sentence, and i can't connect it to the limitation of the truncate implementation above.  Can you take a stab at rephrasing?  Seems important to convey correctly.

So now I'm not sure if the description is bad or I'm worrying about nothing. 

FWIW, to explain the problem in more verbose terms:

If we have a sequence of Ops(https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/storage.thrift#L95) and we have a log like:

    OP 4
    OP 3
    OP 2
    OP 1
    
Then after the whole snapshot process is complete (assuming no further entries during asynchronous snapshotting), we'd have two logs like:

    OP 4        SNAPSHOT (POSITION = 4)
    

And then when replaying, we'll apply the snapshot and then apply operation 4 again. If we can guarantee applying the same op twice in a row would always result in the same storage state as applying it once, then appending a noop is not needed.


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160184
-----------------------------------------------------------


On Dec. 22, 2016, 9:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 9:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by Bill Farner <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review160184
-----------------------------------------------------------



The approach appears sound to me.  I did not study the code too thoroughly, but dropped a few thoughts.


src/main/java/org/apache/aurora/scheduler/log/Log.java (line 47)
<https://reviews.apache.org/r/54883/#comment231255>

    Nit: Consider calling this `TransactionLog` for more conventional terminology.



src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java (line 450)
<https://reviews.apache.org/r/54883/#comment231254>

    You may want to log in an `else`, to help pinpoint unexpected data.



src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java (line 529)
<https://reviews.apache.org/r/54883/#comment231256>

    I'm not following this sentence, and i can't connect it to the limitation of the truncate implementation above.  Can you take a stab at rephrasing?  Seems important to convey correctly.



src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java (line 534)
<https://reviews.apache.org/r/54883/#comment231257>

    Worth adding a TODO.  Repeating from your doc - there's non-trivial work done to create the final (deduped) `Snapshot` that doesn't need the storage lock.  Might be worth exploring at a future date.


- Bill Farner


On Dec. 22, 2016, 1:26 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 1:26 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>


Re: Review Request 54883: Move snapshots into a separate log

Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/
-----------------------------------------------------------

(Updated Dec. 22, 2016, 9:26 a.m.)


Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.


Changes
-------

Try again with the proposal doc.


Repository: aurora


Description (updated)
-------

See here for more details: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing

The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 

As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.


Diffs
-----

  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
  examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
  examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
  src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
  src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
  src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
  src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
  src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
  src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
  src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
  src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
  src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
  src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
  src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 

Diff: https://reviews.apache.org/r/54883/diff/


Testing
-------

I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.


Example log output on my vagrant image:

I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0


Thanks,

David McLaughlin


Re: Review Request 54883: Move snapshots into a separate log

Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/
-----------------------------------------------------------

(Updated Dec. 22, 2016, 9:05 a.m.)


Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.


Changes
-------

Fix the doc to not be Twitter-internal.


Repository: aurora


Description (updated)
-------

See here for more details: https://docs.google.com/a/dmclaughlin.com/document/d/1OE7IVAAQ5vZlYQMWcNCmpyTHKHQZCxw_nqjLIlJe3Vo/edit?usp=sharing

The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 

As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.


Diffs (updated)
-----

  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
  examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
  examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
  src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
  src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
  src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
  src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
  src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
  src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
  src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
  src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
  src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
  src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
  src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 

Diff: https://reviews.apache.org/r/54883/diff/


Testing
-------

I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.


Example log output on my vagrant image:

I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0


Thanks,

David McLaughlin


Re: Review Request 54883: Move snapshots into a separate log

Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54883/#review159933
-----------------------------------------------------------



Master (38b9311) is red with this patch.
  ./build-support/jenkins/build.sh


:generateBuildProperties
:processResources
:classes
:jar
:startScripts
:distTar
:distZip
:assemble
:compileJmhJavaNote: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/jmh/java/org/apache/aurora/benchmark/fakes/FakeSchedulerDriver.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

:processJmhResources UP-TO-DATE
:jmhClasses
:checkstyleJmh
:jsHint
:checkstyleMain[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java:58:8: Unused import - org.apache.aurora.scheduler.log.mesos.MesosLogStreamModule. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java:1: File contains a sequence of empty lines. [RegexpMultiline]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java:1: Line does not match expected header line of '^\/\*\*$'. [RegexpHeader]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java:34: File contains a sequence of empty lines. [RegexpMultiline]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java:22:8: Unused import - java.util.Objects. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java:347: File contains a sequence of empty lines. [RegexpMultiline]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java:31:15: Unused import - com.google.common.base.Preconditions.checkArgument. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java:20:8: Unused import - org.apache.aurora.scheduler.log.Log. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java:36:8: Unused import - org.apache.aurora.codec.ThriftBinaryCodec. [UnusedImports]
[ant:checkstyle] [ERROR] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java:106:78: '{' is not preceded with whitespace. [WhitespaceAround]
 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':checkstyleMain'.
> Checkstyle rule violations were found. See the report at: file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/checkstyle/main.html

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

Total time: 1 mins 15.169 secs


I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot


On Dec. 22, 2016, 8:49 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54883/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2016, 8:49 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar Shanmugham, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> See here for more details: https://docs.google.com/a/twitter.com/document/d/1BzJsx7rGAzJvIZnI-DSuVGD524tc7M2vPrH0HUORlfI/edit?usp=sharing
> 
> The motivation for this patch is the realisation that if we store Snapshots outside of the main transaction log, we only ever to need to hold the storage lock during snapshot creation. This would reduce the time writes are blocked during snapshots by 80% for a large production cluster. I'm also doing this with a view to future work, which is described in the document above. 
> 
> As long as everything works fine, this change is as simple as adding two new args to the scheduler JVM. But it's worth noting that once this change is deployed and the first snapshot is written, you will no longer be able to rollback without first restoring from a backup.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 9e4213f13255a182df938bea44ca87fa03a25318 
>   examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c 
>   examples/vagrant/upstart/aurora-scheduler.conf e68a801017ae02e0ed581129e12a645ccc1e35d4 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 43cc5b4645bc26b0fc6b23726ad3292699048ded 
>   src/main/java/org/apache/aurora/scheduler/log/Log.java dc77eb435e5f8fdce56727a2f679e8e1907e977c 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java 21855e184fe20dc339713978b32344b6950701ec 
>   src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java 6704a328a4023a178ed8f86ae4772cb04eb2fa8e 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java ea147c0ba6aaa6d113144be0a8330bd2f73d2453 
>   src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java baf2647c54f1f9918139584e5151873a3853e83c 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 29a3b4a57925f31e59a49d4bfa630e724cadcb04 
>   src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java f142f545799d64f9352b0ac6c51942eedf5e9ced 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java 3f445595a81a5655c6c486791a9b55d8dc7f2f76 
>   src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 34c24aac9339e59c435925bd2357ce86998f5f02 
> 
> Diff: https://reviews.apache.org/r/54883/diff/
> 
> 
> Testing
> -------
> 
> I currently have only tested this on one node (wrote this code on a flight). Will have to verify this in our test clusters before shipping.
> 
> 
> Example log output on my vagrant image:
> 
> I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() 
> I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at position: Position 6 
> I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot 
> I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. 
> I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. 
> I1222 07:18:36.592 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a snapshot with 0 tasks. 
> I1222 07:18:36.595 [pool-12-thread-1, SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating snapshot. Deduplication ratio: 0/0 = NaN%. 
> I1222 07:18:36.599218  4155 log.cpp:577] Attempting to append 6261 bytes to the log
> I1222 07:18:36.599321  4155 coordinator.cpp:348] Coordinator attempting to write APPEND action at position 5
> I1222 07:18:36.599573  4160 replica.cpp:537] Replica received write request for position 5 from (11)@192.168.33.7:8083
> I1222 07:18:36.600693  4160 leveldb.cpp:341] Persisting action (6280 bytes) to leveldb took 1.061504ms
> I1222 07:18:36.600733  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.600844  4160 replica.cpp:691] Replica received learned notice for position 5 from @0.0.0.0:0
> I1222 07:18:36.601541  4160 leveldb.cpp:341] Persisting action (6282 bytes) to leveldb took 619657ns
> I1222 07:18:36.601569  4160 replica.cpp:712] Persisted action at 5
> I1222 07:18:36.601582  4160 replica.cpp:697] Replica learned APPEND action at position 5
> I1222 07:18:36.603044  4160 log.cpp:596] Attempting to truncate the log to 5
> I1222 07:18:36.603173  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 6
> I1222 07:18:36.603330  4160 replica.cpp:537] Replica received write request for position 6 from (12)@192.168.33.7:8083
> I1222 07:18:36.603890  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 502063ns
> I1222 07:18:36.603919  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.603991  4160 replica.cpp:691] Replica received learned notice for position 6 from @0.0.0.0:0
> I1222 07:18:36.604279  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 217834ns
> I1222 07:18:36.604357  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 13629ns
> I1222 07:18:36.604372  4160 replica.cpp:712] Persisted action at 6
> I1222 07:18:36.604383  4160 replica.cpp:697] Replica learned TRUNCATE action at position 6
> I1222 07:18:36.604581  4160 log.cpp:596] Attempting to truncate the log to 6
> I1222 07:18:36.604682  4160 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 7
> I1222 07:18:36.604778  4160 replica.cpp:537] Replica received write request for position 7 from (13)@192.168.33.7:8083
> I1222 07:18:36.605317  4160 leveldb.cpp:341] Persisting action (16 bytes) to leveldb took 485544ns
> I1222 07:18:36.605356  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605439  4160 replica.cpp:691] Replica received learned notice for position 7 from @0.0.0.0:0
> I1222 07:18:36.605736  4160 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 215503ns
> I1222 07:18:36.605773  4160 leveldb.cpp:399] Deleting ~3 keys from leveldb took 9620ns
> I1222 07:18:36.605787  4160 replica.cpp:712] Persisted action at 7
> I1222 07:18:36.605798  4160 replica.cpp:697] Replica learned TRUNCATE action at position 7
> I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>