Posted to oak-commits@jackrabbit.apache.org by md...@apache.org on 2016/12/01 13:31:48 UTC

svn commit: r1772200 - /jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md

Author: mduerig
Date: Thu Dec  1 13:31:48 2016
New Revision: 1772200

URL: http://svn.apache.org/viewvc?rev=1772200&view=rev
Log:
OAK-4292: Document Oak segment-tar
Wording, typo

Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md?rev=1772200&r1=1772199&r2=1772200&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md Thu Dec  1 13:31:48 2016
@@ -62,7 +62,7 @@ When garbage collection runs, a second g
 As soon as the second generation is in place, data from the first generation that is still used by the user is copied over to the second generation.
 From this moment on, new data will be assigned to the second generation.
 Now the system contains data from the first and the second generation, but only data from the second generation is used.
-The compaction algorithm can now remove every piece of data from the first generation.
+The garbage collector can now remove every piece of data from the first generation.
 This removal is safe, because every piece of data that is still in use was copied to the second generation when garbage collection started.
 
 The process of creating a new generation, migrating data to the new generation and removing an old generation is usually referred to as a "garbage collection cycle".
@@ -74,8 +74,8 @@ While the previous section describes the
 Oak Segment Tar splits the garbage collection process in three phases: estimation, compaction and cleanup.
 
 Estimation is the first phase of garbage collection.
-In this phase, the system checks how much garbage is actually present in the system.
-If there is not enough garbage to justify the creation of a new generation, this phase is responsible of blocking the rest of the garbage collection process.
+In this phase, the system estimates how much garbage is actually present in the system.
+If there is not enough garbage to justify the creation of a new generation, the rest of the garbage collection process is skipped.
 If the output of this phase reports that the amount of garbage is beyond a certain threshold, the system creates a new generation and goes on with the next phase.
 
 Compaction executes after a new generation is created.
@@ -125,8 +125,9 @@ To make the examples clear, some informa
 These information depend on the configuration of your logging framework.
 Moreover, some of those messages contain data that can and will change from one execution to the other.
 
-Every log message generated during the garbage collection process always print the number of the new generation that is being created as part of garbage collection.
-The generation is always printed at the beginning of the message like in the following example.
+Every log message generated during the garbage collection process includes a sequence number 
+indicating how many times garbage collection ran since the system started.
+The sequence number is always printed at the beginning of the message like in the following example.
 
 ```
 TarMK GC #2: ...
@@ -156,7 +157,7 @@ The estimation phase can be disabled by
 TarMK GC #2: estimation skipped because it was explicitly disabled
 ```
 
-Estimation can also be disabled because garbage collection is disabled as a whole. In this case, the following message is printed instead.
+Estimation is also skipped when compaction is disabled on the system. In this case, the following message is printed instead.
 
 ```
 TarMK GC #2: estimation skipped because compaction is paused
@@ -178,7 +179,7 @@ In each of these cases, the reason why e
 
 ##### <a name="when-did-estimation-complete"/> When did estimation complete?
 
-When estimation terminates, either because of external cancellation or after a successful execution, the following messge is printed.
+When estimation terminates, either because of external cancellation or after a successful execution, the following message is printed.
 
 ```
 TarMK GC #2: estimation completed in 961.8 μs (0 ms). ${RESULT}
@@ -228,7 +229,7 @@ When compaction complete successfully, t
 TarMK GC #2: compaction succeeded in 6.580 min (394828 ms), after 2 cycles
 ```
 
-The time showed my the log message is relative to the compaction phase only.
+The time shown in the log message is relative to the compaction phase only.
 The reference to the amount of cycles spent for the compaction phase is explained in more detail below.
 If compaction did not complete successfully, the following message is printed instead.
 
@@ -262,7 +263,7 @@ When compaction first tries to setup the
 TarMK GC #2: compaction cycle 0 completed in 6.580 min (394828 ms). Compacted 3e3b35d3-2a15-43bc-a422-7bd4741d97a5.0000002a to 348b9500-0d67-46c5-a683-3ea8b0e6c21c.000012c0
 ```
 
-The message shows how long did it take to compact the data to the new generation.
+The message shows how long it took to compact the data to the new generation.
 It also prints the record identifiers of the two head states.
 The head state on the left belongs to the previous generation, the one on the right to the new.
 
@@ -294,8 +295,8 @@ TarMK GC #2: compaction gave up compacti
 The message means that compaction tried to compact the repository data to the new generation for five times, but every time there were concurrent changes that prevented compaction from completion.
 To prevent the system from being too overloaded with background activity, compaction stopped itself after the configured amount of cycles.
 
-The system can also be configured to obtain exclusive control of the system and force compaction to complete.
-This means that if compaction would give up after the configured amount of cycles, it would instead take full control of the repository and block concurrent writes.
+At this point the system can be configured to obtain exclusive access of the system and force compaction to complete.
+This means that if compaction gave up after the configured number of cycles, it would take full control over the repository and block concurrent writes.
 If the system is configured to behave this way, the following message is printed.
 
 ```
@@ -305,7 +306,7 @@ TarMK GC #2: trying to force compact rem
 If, after taking exclusive control of the repository for the specified amount of time, compaction completes successfully, the following message will be printed.
 
 ```
-TarMK GC #2: compaction succeeded to force compact remaining commits after 6.580 min (394828 ms).
+TarMK GC #2: compaction succeeded to force compact remaining commits after 56.7 s (56722 ms).
 ```
 
 Sometimes the amount of time allocated to the compaction phase in exclusive mode is not enough.