Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2022/10/25 19:39:56 UTC

[GitHub] [accumulo] dlmarion opened a new issue, #3045: External Compaction stuck

dlmarion opened a new issue, #3045:
URL: https://github.com/apache/accumulo/issues/3045

   In testing 2.1.0-RC1, I found that an external compaction became "stuck". 
   
   From the tserver log:
   ```
   2022-10-25T14:57:33,815 [tablet.files] DEBUG: Compacting 1;1f5c28f5c28f5c58;1eb851eb851eb88 on e.q1 for SYSTEM from [C0000zjn.rf, C0001hht.rf, F0001hjq.rf, F0001hk2.rf, C000188q.rf, C0001hjd.rf, C0001c2z.rf, F0001hkd.rf, C0001eis.rf] size 497 MB
   2022-10-25T14:57:33,815 [compactions.CompactionManager] DEBUG: Reserved external compaction ECID:b6c0707e-d39b-4be8-a3d2-8a12485061d5
   2022-10-25T14:57:34,923 [compactions.CompactionManager] DEBUG: Attempting to reserve external compaction, queue:q1 priority:-32749 compactor:X.Y.Z.Z:9133
   ```
   
   From the compactor log:
   ```
   2022-10-25T14:57:33,816 [compactor.Compactor] DEBUG: Received next compaction job: TExternalCompactionJob(externalCompactionId:ECID:b6c0707e-d39b-4be8-a3d2-8a12485061d5, extent:TKeyExtent(table:31, endRow:31 66 35 63 32 38 66 35 63 32 38 66 35 63 35 38, prevEndRow:31 65 62 38 35 31 65 62 38 35 31 65 62 38 38),...
   2022-10-25T14:57:33,816 [compactor.Compactor] INFO : Starting up compaction runnable for job: TExternalCompactionJob(externalCompactionId:ECID:b6c0707e-d39b-4be8-a3d2-8a12485061d5, extent:TKeyExtent(table:31, endRow:31 66 35 63 32 38 66 35 63 32 38 66 35 63 35 38, prevEndRow:31 65 62 38 35 31 65 62 38 35 31 65 62 38 38)...
   ```
   and **then the compactor was killed by the agitator at 14:58**.
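
The `endRow` and `prevEndRow` fields in the compactor log are printed as space-separated hex bytes, while the tserver log prints the same extent as text. A minimal sketch for lining the two up, assuming the bytes are printable ASCII and that the tserver form is `table;endRow;prevEndRow` (inferred from the log lines above; `ExtentHex` is a hypothetical helper, not part of Accumulo):

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: decodes the hex bytes shown in a TKeyExtent log line. */
final class ExtentHex {

  /** Turns "31 66 35" into "1f5"; assumes every byte is printable ASCII. */
  static String decode(String spacedHex) {
    String[] parts = spacedHex.trim().split("\\s+");
    byte[] bytes = new byte[parts.length];
    for (int i = 0; i < parts.length; i++) {
      bytes[i] = (byte) Integer.parseInt(parts[i], 16);
    }
    return new String(bytes, StandardCharsets.US_ASCII);
  }

  /** Joins the three fields in the order the tserver log appears to use. */
  static String extent(String table, String endRow, String prevEndRow) {
    return decode(table) + ";" + decode(endRow) + ";" + decode(prevEndRow);
  }

  public static void main(String[] args) {
    // Values copied from the compactor log above.
    System.out.println(extent(
        "31",
        "31 66 35 63 32 38 66 35 63 32 38 66 35 63 35 38",
        "31 65 62 38 35 31 65 62 38 35 31 65 62 38 38"));
    // prints 1;1f5c28f5c28f5c58;1eb851eb851eb88
  }
}
```

The decoded value matches the extent in the tserver log, which confirms that both logs refer to the same tablet.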
   
   From the coordinator log:
   ```
   2022-10-25T14:57:33,816 [coordinator.CompactionCoordinator] DEBUG: Returning external job ECID:b6c0707e-d39b-4be8-a3d2-8a12485061d5 to X.Y.Z.Z:9133
   2022-10-25T14:57:33,817 [coordinator.CompactionCoordinator] DEBUG: Compaction status update, id: ECID:b6c0707e-d39b-4be8-a3d2-8a12485061d5, timestamp: 1666709853817, update: TCompactionStatusUpdate(state:STARTED, message:Compaction started, entriesToBeCompacted:-1, entriesRead:-1, entriesWritten:-1)
   2022-10-25T15:00:35,363 [coordinator.DeadCompactionDetector] DEBUG: Possible dead compaction detected ECID:b6c0707e-d39b-4be8-a3d2-8a12485061d5 1;1f5c28f5c28f5c58;1eb851eb851eb88
   2022-10-25T15:05:35,380 [coordinator.DeadCompactionDetector] DEBUG: Possible dead compaction detected ECID:b6c0707e-d39b-4be8-a3d2-8a12485061d5 1;1f5c28f5c28f5c58;1eb851eb851eb88
   ```
   
   The message `Possible dead compaction detected...` is emitted [here](https://github.com/apache/accumulo/blob/main/server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/DeadCompactionDetector.java#L120), and the compaction is then killed [here](https://github.com/apache/accumulo/blob/main/server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/DeadCompactionDetector.java#L134) if it is detected more than twice. For some reason, that is not happening in this case.
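
To make the "detected more than twice" rule concrete, here is a rough sketch of a counter that only fails a compaction after it has looked dead on more than `threshold` consecutive passes. This is not the actual `DeadCompactionDetector` code; the class name, method name, and the choice to reset the count when an id stops looking dead are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of threshold-based dead compaction detection. */
final class DeadCompactionSketch {
  // How many consecutive passes each id has looked dead.
  private final Map<String, Integer> sightings = new HashMap<>();
  private final int threshold;

  DeadCompactionSketch(int threshold) {
    this.threshold = threshold;
  }

  /**
   * Runs one detection pass.
   *
   * @param suspects ids that look dead on this pass
   * @return ids seen on more than {@code threshold} consecutive passes
   */
  List<String> pass(Set<String> suspects) {
    // Forget ids that no longer look dead, so only consecutive sightings count.
    sightings.keySet().retainAll(suspects);
    List<String> toFail = new ArrayList<>();
    for (String id : suspects) {
      int seen = sightings.merge(id, 1, Integer::sum);
      if (seen > threshold) {
        toFail.add(id);
      }
    }
    // Ids reported as failed start from zero if they ever show up again.
    toFail.forEach(sightings::remove);
    return toFail;
  }
}
```

With `threshold` set to 2, an id is returned on the third consecutive pass in which it appears, and not before.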
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] dlmarion commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
dlmarion commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1297044101

   > I don't see anything ATM that makes me think this will impact the correctness of external compactions.
   
   Agreed, I didn't see any issues because of this orphan external compaction in the RUNNING set. Restarting the Coordinator removed it from Monitor.




[GitHub] [accumulo] ctubbsii commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1292461121

   I think it is likely this was fixed by #3049. I will close it, but can reopen if it occurs again.




[GitHub] [accumulo] ctubbsii commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1295516142

   @dlmarion Do you think this is a blocker for 2.1?




[GitHub] [accumulo] EdColeman commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
EdColeman commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1294952302

   Could this be triggered by the compactor not recovering from an IOException?  In the current run, the compaction is shown as running for a long time, but I think the compactor actually died.
   
   ```
   2022-10-28T06:46:05,732 [compactor.Compactor] INFO : Starting up compaction runnable for job: TExternalCompactionJob(externalCompactionId:ECID:7165d849-502f-4dd2-9d29-5530ff7874f2, extent:TKeyExtent(table:31, endRow:35 35 30 61 34 37, prevEndRow:35 34 66 35 63 63), files:[InputFile(metadataFileEntry:hdfs://10.113.15.70:8000/accumulo/tables/1/t-000f7qh/C001iqkw.rf, size:79216171, entries:1993712, timestamp:-1), InputFile(metadataFileEntry:hdfs://10.113.15.70:8000/accumulo/tables/1/t-000f7qh/F001jops.rf, size:430831, entries:10936, timestamp:-1), InputFile(metadataFileEntry:hdfs://10.113.15.70:8000/accumulo/tables/1/t-000f7qh/C001jmbx.rf, size:1581904, entries:47271, timestamp:-1), InputFile(metadataFileEntry:hdfs://10.113.15.70:8000/accumulo/tables/1/t-000f7qh/C0016gno.rf, size:77565257, entries:1994277, timestamp:-1), InputFile(metadataFileEntry:hdfs://10.113.15.70:8000/accumulo/tables/1/t-000f7qh/C001jdog.rf, size:7055973, entries:190899, timestamp:-1), InputFile(metadataFileEntry:hdfs://10.113.15.70:8000/accumulo/tables/1/t-000f7qh/F001jmfa.rf, size:412293, entries:11682, timestamp:-1), InputFile(metadataFileEntry:hdfs://10.113.15.70:8000/accumulo/tables/1/t-000f7qh/C001ctsl.rf, size:83113000, entries:2114094, timestamp:-1)], iteratorSettings:IteratorConfig(iterators:[]), outputFile:hdfs://10.113.15.70:8000/accumulo/tables/1/t-000f7qh/C001jqxc.rf_tmp, propagateDeletes:true, kind:SYSTEM, userCompactionId:0, overrides:{})
   2022-10-28T06:46:05,733 [compactor.Compactor] DEBUG: Progress checks will occur every 23 seconds
   2022-10-28T06:46:08,896 [server.GarbageCollectionLogger] DEBUG: gc G1 Young Generation=47.72(+0.01) secs G1 Old Generation=0.00(+0.00) secs freemem=109,708,624(-18,760,296) totalmem=268,435,456
   2022-10-28T06:46:13,897 [server.GarbageCollectionLogger] DEBUG: gc G1 Young Generation=47.73(+0.01) secs G1 Old Generation=0.00(+0.00) secs freemem=157,836,976(+29,368,056) totalmem=268,435,456
   2022-10-28T06:46:18,897 [server.GarbageCollectionLogger] DEBUG: gc G1 Young Generation=47.73(+0.01) secs G1 Old Generation=0.00(+0.00) secs freemem=69,887,424(-58,581,496) totalmem=268,435,456
   2022-10-28T06:46:23,897 [server.GarbageCollectionLogger] DEBUG: gc G1 Young Generation=47.74(+0.01) secs G1 Old Generation=0.00(+0.00) secs freemem=122,355,152(-6,113,768) totalmem=268,435,456
   2022-10-28T06:46:28,734 [compactor.Compactor] DEBUG: Updating coordinator with compaction progress: Compaction in progress, read 4110336 of 6362871 input entries ( 64.59876 % ), written 4099072 entries.
   2022-10-28T06:46:28,897 [server.GarbageCollectionLogger] DEBUG: gc G1 Young Generation=47.75(+0.01) secs G1 Old Generation=0.00(+0.00) secs freemem=156,882,320(+28,413,400) totalmem=268,435,456
   2022-10-28T06:46:33,514 [compaction.FileCompactor] WARN : Failed to close map file
   java.io.IOException: Filesystem closed
           at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:494) ~[hadoop-client-api-3.3.4.jar:?]
           at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:768) ~[hadoop-client-api-3.3.4.jar:?]
           at java.io.FilterInputStream.close(FilterInputStream.java:180) ~[?:?]
           at java.io.FilterInputStream.close(FilterInputStream.java:180) ~[?:?]
           at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.close(CachableBlockFile.java:464) ~[accumulo-core-2.1.0.jar:2.1.0]
           at org.apache.accumulo.core.file.rfile.RFile$Reader.close(RFile.java:1299) ~[accumulo-core-2.1.0.jar:2.1.0]
           at org.apache.accumulo.server.compaction.FileCompactor.compactLocalityGroup(FileCompactor.java:417) ~[accumulo-server-base-2.1.0.jar:2.1.0]
           at org.apache.accumulo.server.compaction.FileCompactor.call(FileCompactor.java:234) ~[accumulo-server-base-2.1.0.jar:2.1.0]
           at org.apache.accumulo.compactor.Compactor.lambda$createCompactionJob$7(Compactor.java:569) ~[accumulo-compactor-2.1.0.jar:2.1.0]
           at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) ~[accumulo-core-2.1.0.jar:2.1.0]
           at java.lang.Thread.run(Thread.java:829) ~[?:?]
   ```




[GitHub] [accumulo] dlmarion commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
dlmarion commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1294983940

   @keith-turner - would like to hear your thoughts here. Also, I can bounce the CompactionCoordinator to test that it resolves the issue. But, I'll wait to hear something from you.




[GitHub] [accumulo] dlmarion commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
dlmarion commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1294982326

   The Compactor process [watches](https://github.com/apache/accumulo/blob/6099e81c891195caa7f54a4f4e96cf9fd10afef0/server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java#L206) for a tablet deletion or split, or if the compaction has been canceled. In this case, the Compactor was dead, so it could not do that.
   
   The Coordinator uses the [DeadCompactionDetector](https://github.com/apache/accumulo/blob/main/server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/DeadCompactionDetector.java) to look for failed external compactions, and then cancels them via the Coordinator.
   
   I'm wondering if the external compaction information does not get retained on a tablet split or something...




[GitHub] [accumulo] asfgit closed issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
asfgit closed issue #3045: External Compaction stuck
URL: https://github.com/apache/accumulo/issues/3045




[GitHub] [accumulo] ctubbsii commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1291430448

   I will create a 2.1.0-RC2 without a fix for this, but I'm still tracking this for 2.1.0 because it may still end up blocking a release. If it doesn't, we can bump it off to 2.1.1.




[GitHub] [accumulo] keith-turner commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1292371525

   I looked around and user compactions do use the code that was broken in #3044.   Not sure if that was causing this, but it certainly seems like it could cause problems for a user compaction.
   
   https://github.com/apache/accumulo/blob/d0d7b585ea7bace2f86d5168058aaf5f33eda69c/server/manager/src/main/java/org/apache/accumulo/manager/tableOps/compact/CompactionDriver.java#L107




[GitHub] [accumulo] keith-turner commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1297061802

   >  If you think it's critical enough to block the release, I am happy to withdraw it, though.
   
   I don't think this issue should block the release.




[GitHub] [accumulo] dlmarion commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
dlmarion commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1295522646

   > I can look at this Monday. Is the external compaction showing up in the monitor, but there is nothing present for it in the metadata table? Is that the problem you are seeing?
   
   Yes. I think the compaction is still in the RUNNING set inside the compaction coordinator; it didn't get canceled due to the issues I mentioned above. So, basically, the accounting is off.
   
   > Do you think this is a blocker for 2.1?
   
   No. I just restarted *only* the compaction coordinator and the "stuck" (orphaned rather) external compaction is gone.
   
   
   The only other thing to note is that when I stopped the compaction coordinator, the Monitor page displayed an error dialogue box that said:
   ```
   DataTables warning: table id=runningTable - Ajax error. For more information about this error, please see http://datatables.net/tn/7
   ```
   Even after restarting the compaction coordinator the boxes kept re-appearing for a little while.




[GitHub] [accumulo] ctubbsii closed issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
ctubbsii closed issue #3045: External Compaction stuck
URL: https://github.com/apache/accumulo/issues/3045




[GitHub] [accumulo] ctubbsii commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1292348681

   Could this be related to the Ample bug fixed in #3049 ?




[GitHub] [accumulo] keith-turner commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1296986881

   @dlmarion I have been looking over the coordinator code.  There are some race conditions w/ the RUNNING set.   The RUNNING set is an informational cache of what might be running, but by the time things are added to it they may not actually be running.    Looking at the code, it seems like this would mostly impact the monitor and ecadmin tool.  I think the code can be restructured to avoid these race conditions, I am going to work on doing that restructuring and submit a PR.
   
   @ctubbsii I saw you moved this to 2.1.1.  I was not sure about that, but it feels like it may be the right thing to do.  I think I could have a fix ready in a day or two if you were interested in waiting, but we will find more bugs in 2.1.0 and will need to release a 2.1.1 anyway.  Also, I don't see anything ATM that makes me think this will impact the correctness of external compactions.
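
One way to picture the RUNNING set problem described above: the set is filled from status updates, so an entry can outlive the compaction it describes. The sketch below treats the set as a cache and periodically drops ids that are absent from an authoritative source (here assumed to be the ids found in the metadata table). It is only an illustration of the idea, not the restructuring proposed in this thread; `RunningCacheSketch` and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: an informational cache reconciled against metadata. */
final class RunningCacheSketch {
  // External compaction id -> compactor address, for display only.
  private final Map<String, String> running = new ConcurrentHashMap<>();

  /** Called when a compactor reports status for a compaction. */
  void recordStatusUpdate(String ecid, String compactorAddress) {
    running.put(ecid, compactorAddress);
  }

  /**
   * Drops cached ids that the authoritative source no longer knows about.
   * The returned count is only exact when no updates arrive concurrently.
   *
   * @param idsInMetadata ids currently referenced by the metadata table (assumed input)
   * @return number of orphaned entries removed
   */
  int reconcile(Set<String> idsInMetadata) {
    int before = running.size();
    running.keySet().retainAll(idsInMetadata);
    return before - running.size();
  }

  /** Immutable view for consumers such as a monitor page. */
  Set<String> snapshot() {
    return Set.copyOf(running.keySet());
  }
}
```

A real fix would also need to handle entries added after the metadata snapshot was taken, otherwise a compaction that just started could be dropped by mistake.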




[GitHub] [accumulo] keith-turner commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1295510589

   @dlmarion  I can look at this Monday.   Is the external compaction showing up in the monitor, but there is nothing present for it in the metadata table?  Is that the problem you are seeing?




[GitHub] [accumulo] ctubbsii commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1297005275

   > @ctubbsii I saw you moved this to 2.1.1.
   
   Yeah, I'm not sure which one it's going to land in. I wasn't going to withdraw the RC4 for this, based on @dlmarion's response to my question about whether he thought it was a blocker, so it'll depend on the outcome of the vote. If you think it's critical enough to block the release, I am happy to withdraw it, though. Otherwise, I think it can be listed in a "known issues" section of the release notes for anybody who might be interested in using external compactions.




[GitHub] [accumulo] dlmarion commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
dlmarion commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1294895698

   Re-opening this, I'm seeing another case of this with RC4. 




[GitHub] [accumulo] dlmarion commented on issue #3045: External Compaction stuck

Posted by GitBox <gi...@apache.org>.
dlmarion commented on issue #3045:
URL: https://github.com/apache/accumulo/issues/3045#issuecomment-1294964833

   So, as happened previously, the Compactor that was running the external compaction was killed by the agitator. There is nothing in the metadata table referencing this external compaction. Here are the relevant logs:
   
   ### Tablet Server
   ```
   2022-10-28T06:46:05,718 [compactions.CompactionManager] DEBUG: Attempting to reserve external compaction, queue:q1 priority:-32753 compactor:a.b.c.d:9133
   2022-10-28T06:46:05,718 [threads.ThreadPools] DEBUG: Creating ScheduledThreadPoolExecutor for GeneralExecutor with 1 threads
   2022-10-28T06:46:05,718 [threads.ThreadPools] DEBUG: Creating ThreadPoolExecutor for org.apache.accumulo.core.clientImpl.TabletServerBatchWriter$MutationWriter with 3 core threads and 3 max threads 180000 MILLISECONDS timeout
   2022-10-28T06:46:05,718 [threads.ThreadPools] DEBUG: Creating ThreadPoolExecutor for BinMutations with 1 core threads and 1 max threads 180000 MILLISECONDS timeout
   2022-10-28T06:46:05,729 [tablet.files] DEBUG: Compacting 1;550a47;54f5cc on e.q1 for SYSTEM from [C0016gno.rf, C001iqkw.rf, F001jmfa.rf, C001ctsl.rf, F001jops.rf, C001jmbx.rf, C001jdog.rf] size 237 MB
   2022-10-28T06:46:05,729 [compactions.CompactionManager] DEBUG: Reserved external compaction ECID:7165d849-502f-4dd2-9d29-5530ff7874f2
   2022-10-28T06:46:12,921 [tablet.Tablet] DEBUG: Waiting to completeClose for 1;550a47;54f5cc. 2 writes 0 scans
   2022-10-28T06:46:13,033 [tablet.location] DEBUG: Split 1;550a47;54f5cc into 1;55000a;54f5cc and 1;550a47;55000a on 10.113.15.203:9997[100000e72e8009f]
   2022-10-28T06:46:13,033 [tablet.Tablet] DEBUG: offline split time :   0.20 secs
   2022-10-28T06:46:13,033 [tserver.TabletServer] INFO : Starting split: 1;550a47;54f5cc
   2022-10-28T06:46:13,034 [tserver.TabletServer] INFO : Tablet split: 1;550a47;54f5cc size0 536996303 size1 536996309 time 240ms
   ```
   
   ### Compaction Coordinator
   ```
   2022-10-28T06:46:05,731 [coordinator.CompactionCoordinator] DEBUG: Returning external job ECID:7165d849-502f-4dd2-9d29-5530ff7874f2 to a.b.c.d:9133
   2022-10-28T06:46:05,732 [coordinator.CompactionCoordinator] DEBUG: Compaction status update, id: ECID:7165d849-502f-4dd2-9d29-5530ff7874f2, timestamp: 1666939565733, update: TCompactionStatusUpdate(state:STARTED, message:Compaction started, entriesToBeCompacted:-1, entriesRead:-1, entriesWritten:-1)
   2022-10-28T06:46:28,734 [coordinator.CompactionCoordinator] DEBUG: Compaction status update, id: ECID:7165d849-502f-4dd2-9d29-5530ff7874f2, timestamp: 1666939588734, update: TCompactionStatusUpdate(state:IN_PROGRESS, message:Compaction in progress, read 4110336 of 6362871 input entries ( 64.59876 % ), written 4099072 entries, entriesToBeCompacted:6362871, entriesRead:4110336, entriesWritten:4099072)
   ```
   
   ### Compactor Log
   ```
   2022-10-28T06:46:05,732 [compactor.Compactor] DEBUG: Received next compaction job: TExternalCompactionJob(externalCompactionId:ECID:7165d849-502f-4dd2-9d29-5530ff7874f2, extent:TKeyExtent(table:31, endRow:35 35 30 61 34 37, prevEndRow:35 34 66 35 63 63), files:...
   2022-10-28T06:46:05,732 [compactor.Compactor] INFO : Starting up compaction runnable for job: TExternalCompactionJob(externalCompactionId:ECID:7165d849-502f-4dd2-9d29-5530ff7874f2, extent:TKeyExtent(table:31, endRow:35 35 30 61 34 37, prevEndRow:35 34 66 35 63 63), files:...
   2022-10-28T06:46:05,733 [compactor.Compactor] DEBUG: Progress checks will occur every 23 seconds
   2022-10-28T06:46:28,734 [compactor.Compactor] DEBUG: Updating coordinator with compaction progress: Compaction in progress, read 4110336 of 6362871 input entries ( 64.59876 % ), written 4099072 entries.
   2022-10-28T06:46:28,897 [server.GarbageCollectionLogger] DEBUG: gc G1 Young Generation=47.75(+0.01) secs G1 Old Generation=0.00(+0.00) secs freemem=156,882,320(+28,413,400) totalmem=268,435,456
   2022-10-28T06:46:33,514 [compaction.FileCompactor] WARN : Failed to close map file
   java.io.IOException: Filesystem closed
           at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:494) ~[hadoop-client-api-3.3.4.jar:?]
           at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:768) ~[hadoop-client-api-3.3.4.jar:?]
           at java.io.FilterInputStream.close(FilterInputStream.java:180) ~[?:?]
           at java.io.FilterInputStream.close(FilterInputStream.java:180) ~[?:?]
           at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.close(CachableBlockFile.java:464) ~[accumulo-core-2.1.0.jar:2.1.0]
           at org.apache.accumulo.core.file.rfile.RFile$Reader.close(RFile.java:1299) ~[accumulo-core-2.1.0.jar:2.1.0]
           at org.apache.accumulo.server.compaction.FileCompactor.compactLocalityGroup(FileCompactor.java:417) ~[accumulo-server-base-2.1.0.jar:2.1.0]
           at org.apache.accumulo.server.compaction.FileCompactor.call(FileCompactor.java:234) ~[accumulo-server-base-2.1.0.jar:2.1.0]
           at org.apache.accumulo.compactor.Compactor.lambda$createCompactionJob$7(Compactor.java:569) ~[accumulo-compactor-2.1.0.jar:2.1.0]
           at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) ~[accumulo-core-2.1.0.jar:2.1.0]
           at java.lang.Thread.run(Thread.java:829) ~[?:?]
   ```
   
   ### Compactor Agitator log
   ```
   20221028 06:46:32 Killing compactor at a.b.c.d
   ```
   
   It appears that the ExternalCompaction started (06:46:05), the tablet was split (06:46:13), and the compactor was killed (06:46:32), all within a short time span. I believe that restarting the compaction coordinator would resolve the issue of the external compaction displaying in the monitor. Also, we have an [IT](https://github.com/apache/accumulo/blob/main/test/src/main/java/org/apache/accumulo/test/compaction/ExternalCompaction_2_IT.java#L94) that tests that an external compaction is cancelled when a tablet is split, so I think this is an edge case that we have not handled.
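
For reference, the gaps between those three events can be computed directly from the log timestamps. This is a small sketch using `java.time`; the agitator log only has second resolution, so its timestamp is padded with `,000` here, and `TimelineSketch` is a hypothetical name.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: measures gaps between log timestamps. */
final class TimelineSketch {
  // Timestamp layout used by the tserver, coordinator, and compactor logs.
  private static final DateTimeFormatter LOG_TS =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss,SSS");

  static long millisBetween(String from, String to) {
    return Duration.between(
        LocalDateTime.parse(from, LOG_TS),
        LocalDateTime.parse(to, LOG_TS)).toMillis();
  }

  public static void main(String[] args) {
    String reserved = "2022-10-28T06:46:05,729"; // tserver: Reserved external compaction
    String split = "2022-10-28T06:46:13,033";    // tserver: Split 1;550a47;54f5cc
    String killed = "2022-10-28T06:46:32,000";   // agitator: Killing compactor (padded)

    System.out.println("reserve -> split: " + millisBetween(reserved, split) + " ms");
    System.out.println("reserve -> kill:  " + millisBetween(reserved, killed) + " ms");
    // prints 7304 ms and 26271 ms
  }
}
```

So the split landed about 7 seconds into the compaction and the kill about 26 seconds in, shortly after the one progress update at 06:46:28.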

